MINIMAX AND DUALITY FOR LINEAR AND NONLINEAR
MIXED-INTEGER PROGRAMMING

by  Egon  Balas 

This  paper  discusses  duality  for  linear  and  nonlinear  programs 
in  which  some  of  the  variables  are  arbitrarily  constrained.  The  most 
important  class  of  such  problems  is  that  of  mixed-integer  (linear  and 
nonlinear)  programs.  Part  I  introduces  the  duality  constructions; 
part  II  discusses  algorithms  based  on  them. 


PART I. SYMMETRIC DUAL MIXED-INTEGER PROBLEMS
1. The Linear Case

Consider the pair of dual linear programs

\[
\text{(LP)}\quad
\begin{array}{l}
\max\ cx\\
Ax + y = b\\
x, y \ge 0
\end{array}
\qquad\qquad
\text{(LD)}\quad
\begin{array}{l}
\min\ ub\\
uA - v = c\\
u, v \ge 0
\end{array}
\]

where $A$ is an $m \times n$ matrix and $\{1,\dots,m\} = M$, $\{1,\dots,n\} = N$.

The main result of linear programming duality theory [1] is that
the primal problem has an optimal solution if and only if the dual has
one, in which case, denoting the two optimal solutions by $(\bar x,\bar y)$ and
$(\bar u,\bar v)$ respectively, we have $c\bar x = \bar u b$ and $\bar u\bar y = \bar v\bar x = 0$. These relations
play a central role in linear programming.

We wish to examine what happens to the above duality properties
if we constrain some of the primal and dual variables to belong to arbitrary
sets, for instance the set of integers. Suppose the first
$n_1$ components of $x$ and the first $m_1$ components of $u$ ($0 \le n_1 \le n$,
$0 \le m_1 \le m$) are arbitrarily constrained, and the following notation is
introduced: $(x_1,\dots,x_{n_1}) = x^1$, $(u_1,\dots,u_{m_1}) = u^1$, $x = (x^1,x^2)$,
$u = (u^1,u^2)$, $\{1,\dots,n_1\} = N_1$, $\{1,\dots,m_1\} = M_1$. Then the above pair of
problems becomes

\[
\text{(LPI)}\quad
\begin{array}{l}
\max\ cx\\
Ax + y = b\\
x, y \ge 0\\
x^1 \in X^1
\end{array}
\qquad\qquad
\text{(LDI)}\quad
\begin{array}{l}
\min\ ub\\
uA - v = c\\
u, v \ge 0\\
u^1 \in U^1
\end{array}
\]

where $X^1$ and $U^1$ are arbitrary sets of vectors in the $n_1$-dimensional and
$m_1$-dimensional Euclidean space.

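Let us partition $A$, $b$, $c$, $y$ and $v$ in accordance with the partitioning
of $x$ and $u$:

\[
A = \begin{pmatrix} A^{11} & A^{12}\\ A^{21} & A^{22} \end{pmatrix},\qquad
b = (b^1,b^2),\quad c = (c^1,c^2),\quad y = (y^1,y^2),\quad v = (v^1,v^2),
\tag{1.1}
\]

where the first block row of $A$ corresponds to $M_1$ and the two block columns
correspond to $N_1$ and $N_2$.

Unless the constraints $x^1 \in X^1$ and $u^1 \in U^1$ happen to be redundant, it
is clear that $cx < ub$ for any pair $x,u$ satisfying (together with some $y,v$)
the constraints of (LPI) and (LDI) respectively: a "gap" appears between
the two optimal objective function values.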
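The gap can be seen on a one-variable instance. The sketch below is only an illustration under stated assumptions: the data (c = 1, a = 2, b = 3) and the helper names `integer_primal` and `lp_dual` are hypothetical, the primal variable is required to be a nonnegative integer, and no dual variable is arbitrarily constrained, so (LDI) coincides with (LD).

```python
from fractions import Fraction


def integer_primal(c, a, b):
    """Value of max c*x s.t. a*x <= b, x a nonnegative integer.

    Assumes a > 0, c > 0 and b >= 0, so the best x is floor(b / a).
    """
    x = b // a
    return c * x


def lp_dual(c, a, b):
    """Value of min u*b s.t. u*a >= c, u >= 0 (assumes a > 0, b >= 0).

    The smallest feasible u is c / a, and the objective increases with u.
    """
    u = Fraction(c, a)
    return u * b


primal_value = integer_primal(1, 2, 3)  # attained at x = 1
dual_value = lp_dual(1, 2, 3)           # attained at u = 1/2
gap = dual_value - primal_value         # strictly positive: cx < ub
```

With these numbers the primal optimum is 1 and the dual optimum is 3/2, so the two values differ by 1/2.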

Suppose now that we attempt to dispose of this gap by "relaxing"
each dual constraint associated with an arbitrarily constrained primal
variable, and each primal constraint associated with an arbitrarily constrained
dual variable; in other words, by dropping the nonnegativity
requirement for each dual slack $v_j$, $j \in N_1$, and for each primal slack $y_i$,
$i \in M_1$. Suppose, further, that while thus permitting the dual constraints
$j \in N_1$ and the primal constraints $i \in M_1$ to be violated, we want the extent
of this violation, as measured by the weighted sums $-v^1x^1$ and $u^1y^1$ respectively, to
be as small as possible. This points towards replacing the initial primal
and dual objective functions by

\[
\min_{u^1}\max_{x}\ cx + u^1y^1 = \min_{u^1}\max_{x}\ cx + u^1(b^1 - A^{11}x^1 - A^{12}x^2)
\tag{1.2}
\]

and

\[
\max_{x^1}\min_{u}\ ub - v^1x^1 = \max_{x^1}\min_{u}\ ub - (u^1A^{11} + u^2A^{21} - c^1)x^1
\tag{1.3}
\]

respectively.

However, it turns out that in order to obtain equality of the two
objective functions, the term $-u^1A^{11}x^1$, occurring in both (1.2) and (1.3),
has to be done away with. Thus, finally we are led to consider the following
pair of problems:

\[
\text{(P)}\qquad
\begin{array}{l}
\min_{u^1}\max_{x}\ cx + u^1y^1 + u^1A^{11}x^1\\
Ax + y = b\\
x^1 \in X^1,\ u^1 \in U^1\\
x^2, y^2 \ge 0\\
y^1\ \text{unconstrained}
\end{array}
\]

\[
\text{(D)}\qquad
\begin{array}{l}
\max_{x^1}\min_{u}\ ub - v^1x^1 + u^1A^{11}x^1\\
uA - v = c\\
u^1 \in U^1,\ x^1 \in X^1\\
u^2, v^2 \ge 0\\
v^1\ \text{unconstrained}
\end{array}
\]

Here, as before, $X^1 \subset R^{n_1}$ and $U^1 \subset R^{m_1}$ are arbitrary sets of vectors
in the respective spaces, with the only restriction that they are supposed
to be independent of each other and of the other variables, i.e., neither of
them is supposed to be defined in terms of other problem variables.

Since in the above pair of problems $y$ is uniquely defined by $x$ and
$v$ is uniquely defined by $u$, a solution to (P) will be written as $(x,u^1)$
and a solution to (D) as $(u,x^1)$.
We define (D) to be the dual of (P). It is easy to see that the
duality defined in this way is involutory (symmetric): the dual of the
dual is the primal. Also, it is easy to see that the mixed-integer linear
program is a special case (actually the most important special case) of (P),
namely the one in which $X^1$ is the set of $n_1$-vectors with nonnegative
integer components, and $m_1 = 0$, i.e., $M_1 = \emptyset$.

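The main feature of the above pair of dual problems is the special
relationship between each primal variable $x_j$ and the associated dual
slack $v_j$, and between each dual variable $u_i$ and the associated primal
slack $y_i$, namely:

\[
\begin{array}{lcl}
x_j\ \text{arbitrarily constrained} &\Longleftrightarrow& v_j\ \text{unconstrained}\\
x_j \ge 0 &\Longleftrightarrow& v_j \ge 0\\
y_i\ \text{unconstrained} &\Longleftrightarrow& u_i\ \text{arbitrarily constrained}\\
y_i \ge 0 &\Longleftrightarrow& u_i \ge 0
\end{array}
\tag{1.4}
\]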


We shall now state a lemma which will be used in the proof of the
next theorem.

Let $s^1,s^2,\dots,s^p$ be elements of arbitrary vector spaces. A vector
function $G(s^1,s^2,\dots,s^p)$ will be called separable with respect to $s^1$
if there exist vector functions $H(s^1)$ (independent of $s^2,\dots,s^p$), and
$K(s^2,\dots,s^p)$ (independent of $s^1$), such that

\[
G(s^1,\dots,s^p) = H(s^1) + K(s^2,\dots,s^p).
\]

$G(s^1,s^2,\dots,s^p)$ will be called componentwise separable with respect
to $s^1$ if each component $g_i$ of $G$ can be written either as $g_i(s^1)$,
or as $g_i(s^2,\dots,s^p)$.

Note that none of these definitions implies separability in each
component of $s^1$. Obviously, the first of the above two definitions also
applies to scalar functions (i.e., one-component vector functions).

Let $r,s,t$ be elements of arbitrary vector spaces. Let $f(r,s,t)$
be a scalar function and $G(r,s,t)$ a vector function. We have

Lemma 1.1. If $f(r,s,t)$ is separable and $G(r,s,t)$ is componentwise
separable with respect to $r$ or $s$, then

\[
\inf_{s}\sup_{r,t}\ \{f(r,s,t) \mid G(r,s,t) \le 0\}
= \sup_{r}\inf_{s}\ \Bigl\{\sup_{t}\ \{f(r,s,t) \mid G(r,s,t) \le 0\}\Bigr\}.
\]

Proof. Suppose $f(r,s,t) = f_1(r) + f_2(s,t)$, and the constraint set can be
written as $G_1(r) \le 0$, $G_2(s,t) \le 0$.

Then both sides of the equality in the Lemma become

\[
\sup_{r}\ \{f_1(r) \mid G_1(r) \le 0\} + \inf_{s}\sup_{t}\ \{f_2(s,t) \mid G_2(s,t) \le 0\}.
\]

Similarly, if $f(r,s,t) = f_1(r,t) + f_2(s)$ and the constraint set can
be written as $G_1(r,t) \le 0$, $G_2(s) \le 0$, then both sides of the equality can
be written as

\[
\sup_{r,t}\ \{f_1(r,t) \mid G_1(r,t) \le 0\} + \inf_{s}\ \{f_2(s) \mid G_2(s) \le 0\}.
\]
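The role of separability in Lemma 1.1 can be checked numerically on finite sets. The following is a minimal sketch under simplifying assumptions: there is no constraint function G and no third variable t, the sets `R` and `S` and the two test functions are hypothetical, and `inf`/`sup` reduce to `min`/`max` because the sets are finite.

```python
# Finite stand-ins for the domains of r and s (hypothetical data).
R = (-1, 1)
S = (-1, 1)


def inf_sup(f):
    """inf over s of sup over r of f(r, s)."""
    return min(max(f(r, s) for r in R) for s in S)


def sup_inf(f):
    """sup over r of inf over s of f(r, s)."""
    return max(min(f(r, s) for s in S) for r in R)


def separable(r, s):
    """f1(r) + f2(s): separable with respect to r."""
    return (r - r * r) + (s * s + 2 * s)


def coupled(r, s):
    """r * s: not separable, and it has no saddle point on R x S."""
    return r * s


separable_values = (inf_sup(separable), sup_inf(separable))
coupled_values = (inf_sup(coupled), sup_inf(coupled))
```

For the separable function both orders of optimization give the same value, while for the coupled function they differ, which is why the lemma needs the separability hypothesis.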



To state our next theorem, let us recall that $y^2$ and $v^2$ are vector
functions of $x^1,x^2$ and $u^1,u^2$ respectively:

\[
y^2 = b^2 - A^{21}x^1 - A^{22}x^2,
\]

\[
v^2 = u^1A^{12} + u^2A^{22} - c^2.
\]

Theorem 1.1. Assume $v^2$ (or $y^2$) to be componentwise separable with
respect to $u^1$ (to $x^1$). Then, if (P) has an optimal solution $(\bar x,\bar u^1)$, there
exists $\bar u^2$ such that $(\bar u,\bar x^1)$, where $\bar u = (\bar u^1,\bar u^2)$, is an optimal solution to
(D), with

\[
\min_{u^1 \in U^1}\max_{x}\ cx + u^1y^1 + u^1A^{11}x^1
= \max_{x^1 \in X^1}\min_{u}\ ub - v^1x^1 + u^1A^{11}x^1,
\tag{1.5}
\]

\[
\bar u^2\bar y^2 = 0,\qquad \bar v^2\bar x^2 = 0
\tag{1.6}
\]

and

\[
(c^2 - \bar u^1A^{12})\bar x^2 - \bar u^2(b^2 - A^{21}\bar x^1) = 0.
\tag{1.7}
\]
Proof. Suppose $v^2$ is componentwise separable with respect to $u^1$.
(An analogous reasoning holds for the case when $y^2$ is componentwise separable
with respect to $x^1$.)

(D) can be stated as the problem of finding

\[
w = \max_{x^1 \in X^1}\min_{u^1 \in U^1}\min_{u^2 \ge 0}\
\bigl\{c^1x^1 + u^1b^1 + u^2(b^2 - A^{21}x^1) \bigm| u^2A^{22} \ge c^2 - u^1A^{12}\bigr\}.
\tag{1.8}
\]

In view of the separability assumption, Lemma 1.1 can be applied to
(1.8), i.e., max and min can be interchanged. Then we have

\[
w = \min_{u^1 \in U^1}\max_{x^1 \in X^1}\
\Bigl\{c^1x^1 + u^1b^1 + \min_{u^2 \ge 0}\ \bigl\{u^2(b^2 - A^{21}x^1) \bigm| u^2A^{22} \ge c^2 - u^1A^{12}\bigr\}\Bigr\}.
\tag{1.9}
\]

On the other hand, (P) can be written as the problem of finding

\[
z = \min_{u^1 \in U^1}\max_{x^1 \in X^1}\
\Bigl\{c^1x^1 + u^1b^1 + \max_{x^2 \ge 0}\ \bigl\{(c^2 - u^1A^{12})x^2 \bigm| A^{22}x^2 \le b^2 - A^{21}x^1\bigr\}\Bigr\}.
\tag{1.10}
\]

For any given $u^1$ and $x^1$ the linear programs in the inner brackets of
(1.9) and (1.10) are dual to each other; and since for $x^1 = \bar x^1$ and $u^1 = \bar u^1$
the vector $\bar x^2$ is supposed to be an optimal solution of the linear program
in (1.10) (otherwise $(\bar x,\bar u^1)$ could not be an optimal solution of (P)),
it follows that the linear program in (1.9) also has an optimal solution
$\bar u^2$, and that for $(u,x^1) = (\bar u,\bar x^1)$, where $\bar u = (\bar u^1,\bar u^2)$, the objective function
of (D) takes on the value of $z$. But then $(\bar u,\bar x^1)$ must be an optimal solution
to (D); for if it is not, i.e., if there exists some $\hat x^1 \in X^1$ such that
$\hat w > z$, where

\[
\hat w = \min_{u^1 \in U^1}\
\Bigl\{c^1\hat x^1 + u^1b^1 + \min_{u^2 \ge 0}\ \bigl\{u^2(b^2 - A^{21}\hat x^1) \bigm| u^2A^{22} \ge c^2 - u^1A^{12}\bigr\}\Bigr\},
\tag{1.11}
\]

then, following the above reasoning, there also exists a vector $\hat x^2$ such
that $(\hat x,\hat u^1)$, where $\hat u^1$ is the value taken on by $u^1$ in (1.11), is a feasible
solution to (P) with an objective function value equal to $\hat w$, which contradicts
the optimality of $(\bar x,\bar u^1)$ for (P). This proves that (1.5) holds, while (1.6)
follows from the fact that $\bar x^2$ and $\bar u^2$ are optimal solutions to the linear
programs in the inner brackets of (1.10) and (1.9).

On the other hand, from

\[
\bar u^2(b^2 - A^{21}\bar x^1 - A^{22}\bar x^2) = 0,
\]

\[
(c^2 - \bar u^1A^{12} - \bar u^2A^{22})\bar x^2 = 0,
\]

we have (1.7).


According to the above theorem, the main results of linear programming
duality theory carry over to the pair of dual problems (P) and (D), provided
$v^2$ (or $y^2$) is componentwise separable with respect to $u^1$ (to $x^1$). Denoting
by $\|B_i\|$ and $\|B^j\|$ respectively the norm of the i-th row and of the j-th
column of a matrix $B$, the above assumption can also be expressed as a
requirement that the matrix $A$ satisfy the condition (see Figure 1):

\[
\|(A^{12})^j\| \cdot \|(A^{22})^j\| = 0,\qquad j \in N_2.
\tag{1.12}
\]

This assumption is obviously a genuine restriction. However, it does
not exclude from the class of problems to which the above results apply any
of the special cases of known interest. In particular, it does not exclude
the general all-integer and mixed-integer linear programs: since in these
cases $m_1 = 0$, $A^{12}$ is a zero matrix and the separability requirement is
satisfied.
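A small worked instance, chosen here purely for illustration (the data are hypothetical and not taken from the references), shows how (D) removes the gap in the pure-integer case. Take $n = n_1 = 1$, $m = 1$, $m_1 = 0$, $A = (2)$, $b = 3$, $c = 1$, and let $X^1$ be the set of nonnegative integers.

```latex
% Primal (P): max x_1  s.t.  2 x_1 + y = 3,  y >= 0,  x_1 a nonnegative integer.
\[
  \max\{\,x_1 \mid 2x_1 + y = 3,\ y \ge 0,\ x_1 \in X^1\,\} = 1
  \quad\text{at } \bar x_1 = 1,\ \bar y = 1 .
\]
% Ordinary LP dual: the bound is not attained by any integer x_1.
\[
  \min\{\,3u \mid 2u \ge 1,\ u \ge 0\,\} = \tfrac{3}{2} .
\]
% Dual (D): m_1 = 0, so the term u^1 A^{11} x^1 is absent and v^1 = 2u - 1
% is unconstrained.
\[
  \max_{x_1 \in X^1}\ \min_{u \ge 0}\ \bigl[\,3u - (2u - 1)\,x_1\,\bigr]
  = \max_{x_1 \in X^1}\ \min_{u \ge 0}\ \bigl[\,x_1 + u\,(3 - 2x_1)\,\bigr].
\]
% Inner minimum: u = 0 if 3 - 2 x_1 > 0, unbounded below otherwise.
\[
  \min_{u \ge 0}\ \bigl[\,x_1 + u\,(3 - 2x_1)\,\bigr] =
  \begin{cases}
    x_1, & x_1 \in \{0, 1\},\\
    -\infty, & x_1 \ge 2,
  \end{cases}
  \qquad\text{so the value of (D) is } 1 \text{ at } \bar x_1 = 1,\ \bar u = 0 .
\]
```

The optimal values of (P) and (D) coincide, and $\bar u\,\bar y = 0 \cdot 1 = 0$, in agreement with (1.5) and (1.6); the ordinary dual bound $3/2$ is strictly larger.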


The above duality construction is rooted in the ideas of Benders [2]
and Stoer [3]. It also bears some relation to the general minimax theorem
of Kakutani [4].

Additional properties of the pair of dual problems (P) and (D) are
discussed in [5]. They include conditions for the existence of feasible
and (finite) optimal solutions, uniqueness of the optimum, and the relationship
between (D) and the dual of the linear program over the convex hull of
feasible points to a mixed-integer program. An economic interpretation
is also given in [5] in terms of a generalised shadow price system, in
which non-negative prices are associated with each constraint, and subsidies
or penalties with each integer-constrained variable of a mixed-integer
program. (For an alternative interpretation of pricing in integer programming
see [6].)

2.  The  Nonlinear  Case 

We now discuss extensions of the above duality construction to the case
of a nonlinear objective function and constraints [7], [8], [9]. This time
our starting point is the pair of symmetric dual nonlinear programs studied
by Dantzig, Eisenberg and Cottle [10]. Let $K(x,u)$ be a differentiable
function of $x \in R^n$ and $u \in R^m$, and let $\nabla_xK(x,u)$ and $\nabla_uK(x,u)$ be the vectors
of partial derivatives of $K$ in the components of $x$ and $u$ respectively.

The nonlinear programs of [10] can then be stated as

\[
\text{(NP)}\qquad
\begin{array}{l}
\max\ K(x,u) - u\nabla_uK(x,u)\\
\nabla_uK(x,u) \ge 0\\
x, u \ge 0
\end{array}
\]

and

\[
\text{(ND)}\qquad
\begin{array}{l}
\min\ K(x,u) - x\nabla_xK(x,u)\\
\nabla_xK(x,u) \le 0\\
x, u \ge 0
\end{array}
\]

The generality of this formulation consists in the fact that $K$ can
be chosen so (see [10]) that the above pair of problems reduces to any of
the dual programs studied by Dorn [11], Cottle [12], Wolfe [13],
Mangasarian [14] and Huard [15].
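The simplest such choice can be written out explicitly. The following worked computation is a sketch added for illustration; it assumes the bilinear function $K(x,u) = cx + ub - uAx$ and shows that the pair (NP), (ND) then reduces to the linear programs (LP), (LD) of section 1, with the slacks $y$ and $v$ appearing as gradients of $K$.

```latex
% Gradients of the bilinear function K(x,u) = cx + ub - uAx.
\[
  \nabla_u K(x,u) = b - Ax = y, \qquad
  \nabla_x K(x,u) = c - uA = -v .
\]
% Objective functions of (NP) and (ND).
\[
  K(x,u) - u\,\nabla_u K(x,u) = cx + ub - uAx - u(b - Ax) = cx ,
\]
\[
  K(x,u) - x\,\nabla_x K(x,u) = cx + ub - uAx - (c - uA)x = ub .
\]
% Resulting pair: nonnegativity of y in (NP) and of v in (ND).
\[
  \text{(NP)}:\ \max\{\,cx \mid Ax \le b,\ x \ge 0\,\}, \qquad
  \text{(ND)}:\ \min\{\,ub \mid uA \ge c,\ u \ge 0\,\}.
\]
```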

The main result of [10] is that, assuming $K$ to be twice differentiable
in $u$, and concave in $x$ for each $u$, convex in $u$ for each $x$, if (NP) has an
optimal solution $(\bar x,\bar u)$ such that the (Hessian) matrix $\nabla^2_uK(\bar x,\bar u)$ of second
partial derivatives of $K$ in the components of $u$, evaluated at $(\bar x,\bar u)$, is
positive definite, then $(\bar x,\bar u)$ is an optimal solution to (ND) and

\[
\bar u\nabla_uK(\bar x,\bar u) = \bar x\nabla_xK(\bar x,\bar u) = 0,
\]

i.e., the two objective functions are equal.

As in the linear case, we now generalize the above pair of dual
nonlinear programs by constraining some of the primal and dual variables
to belong to arbitrary sets. Partitioning $x$ and $u$ in the same way as
before and denoting again by $X^1$ and $U^1$ arbitrary sets of $n_1$-vectors and
$m_1$-vectors respectively, we are led to consider the pair of problems

\[
\text{(P)}\qquad
\begin{array}{l}
\min_{u^1}\max_{x,u^2}\ f = K(x,u) - u^2\nabla_{u^2}K(x,u)\\
\nabla_{u^2}K(x,u) \ge 0\\
x^1 \in X^1,\ u^1 \in U^1\\
x^2, u^2 \ge 0
\end{array}
\]

\[
\text{(D)}\qquad
\begin{array}{l}
\max_{x^1}\min_{x^2,u}\ g = K(x,u) - x^2\nabla_{x^2}K(x,u)\\
\nabla_{x^2}K(x,u) \le 0\\
x^1 \in X^1,\ u^1 \in U^1\\
x^2, u^2 \ge 0
\end{array}
\]

where $\nabla_{x^2}K(x,u)$ and $\nabla_{u^2}K(x,u)$ stand for the vectors of partial derivatives
of $K$ in the components of $x^2$ and $u^2$ respectively.

We define (D) to be the dual of (P). Obviously, the duality defined
in this way is symmetric (involutory). It is easy to see that a mixed-integer
nonlinear program is a special case of (P), in which $X^1$ is the
set of $n_1$-vectors with nonnegative integer components, $m_1 = 0$, and

\[
K(x,u) = f(x) - uF(x)
\tag{2.1}
\]

with $f(x) \in R$ and $F(x) \in R^m$.

In the following, we shall assume (as in the linear case) that the
sets $X^1 \subset R^{n_1}$ and $U^1 \subset R^{m_1}$, while arbitrary, are independent of each other
and of the other variables of the problem. Also, the concept of separability
with respect to $u^1$ (or $x^1$) will again be used in the sense defined in section
1, i.e., it will not imply separability in each component of $u^1$ (or $x^1$).

When $K(x,u)$ is twice differentiable in the components of $x^2$ and $u^2$,
let $\nabla^2_{x^2}K(\bar x,\bar u)$ and $\nabla^2_{u^2}K(\bar x,\bar u)$ be the (Hessian) matrices of second partial
derivatives of $K$ in the components of $x^2$ and $u^2$ respectively, evaluated
at $(\bar x,\bar u)$. We then define the following regularity condition for (P) and (D):

(a) If $(\bar x,\bar u)$ solves (P), $\nabla^2_{u^2}K(\bar x,\bar u)$ is positive definite;

(b) If $(\bar x,\bar u)$ solves (D), $\nabla^2_{x^2}K(\bar x,\bar u)$ is negative definite.
Denoting the constraint sets of (P) and (D) by $Z$ and $W$ respectively,
we have

Theorem 2.1. Assume that

1. $K(x,u)$ is concave in $x^2$ for each $x^1,u$, and convex in $u^2$ for each
$x,u^1$.

2. $K(x,u)$ is twice differentiable in $x^2$ and $u^2$; (P) and (D) meet
the regularity condition.

3. $K(x,u)$ is separable with respect to $u^1$ or $x^1$.

Given 1, 2, 3, if $(\bar x,\bar u)$ solves (P), then it also solves (D) and

\[
\min_{u^1}\max_{x,u^2}\ \{f \mid (x,u) \in Z\} = \max_{x^1}\min_{x^2,u}\ \{g \mid (x,u) \in W\}
\tag{2.2}
\]

with

\[
\bar u^2\nabla_{u^2}K(\bar x,\bar u) = \bar x^2\nabla_{x^2}K(\bar x,\bar u) = 0.
\tag{2.3}
\]

Proof. Denote

\[
z = \min_{u^1}\max_{x,u^2}\ \{f \mid (x,u) \in Z\},\qquad
w = \max_{x^1}\min_{x^2,u}\ \{g \mid (x,u) \in W\}.
\tag{2.4}
\]

Assume that $K(x,u)$ is separable with respect to $u^1$, i.e.,

\[
K(x,u) = K_1(u^1) + K_2(x,u^2).
\tag{2.5}
\]

(An analogous reasoning holds if $K$ is separable with respect to $x^1$.)
Then $z$ can be written as

\[
z = \min_{u^1 \in U^1}\ \max_{\substack{x^1 \in X^1\\ x^2,u^2 \ge 0}}\
\bigl\{K_1(u^1) + K_2(x,u^2) - u^2\nabla_{u^2}K_2(x,u^2) \bigm| \nabla_{u^2}K_2(x,u^2) \ge 0\bigr\}
\]

or

\[
z = \max_{x^1 \in X^1}\min_{u^1 \in U^1}\ \{K_1(u^1) + f_2(x^1)\}
\tag{2.6}
\]

where

\[
f_2(x^1) = \max_{x^2,u^2 \ge 0}\
\bigl\{K_2(x,u^2) - u^2\nabla_{u^2}K_2(x,u^2) \bigm| \nabla_{u^2}K_2(x,u^2) \ge 0\bigr\}
\tag{2.7}
\]

and $w$ can be written as

\[
w = \max_{x^1 \in X^1}\min_{u^1 \in U^1}\ \{K_1(u^1) + g_2(x^1)\}
\tag{2.8}
\]

where

\[
g_2(x^1) = \min_{x^2,u^2 \ge 0}\
\bigl\{K_2(x,u^2) - x^2\nabla_{x^2}K_2(x,u^2) \bigm| \nabla_{x^2}K_2(x,u^2) \le 0\bigr\}.
\tag{2.9}
\]

For any given $x^1$, (2.7) and (2.9) are a pair of symmetric dual nonlinear
programs of the type discussed in [10]. Hence, using the above mentioned
results of [10], in view of assumptions 1 and 2 we have, for $x^1 = \bar x^1$,

\[
\bar u^2\nabla_{u^2}K_2(\bar x,\bar u^2) = \bar x^2\nabla_{x^2}K_2(\bar x,\bar u^2) = 0
\tag{2.10}
\]

and

\[
f_2(\bar x^1) = g_2(\bar x^1).
\tag{2.11}
\]

It remains to be shown that $(\bar x,\bar u)$ is indeed optimal for (D). If this
is not the case, there exists $\hat x^1 \in X^1$ such that $g_2(\hat x^1) > g_2(\bar x^1)$. But then,
in view of the regularity condition for (D), we have

\[
g_2(\hat x^1) = f_2(\hat x^1) > f_2(\bar x^1),
\tag{2.12}
\]

which contradicts the optimality of $(\bar x,\bar u)$ for (P).

This, together with (2.6), (2.8) and (2.11), proves (2.2), whereas
(2.3) follows from (2.5) and (2.10).

Q.e.d.

Assumptions 1 and 2 are the same as the ones required by Dantzig,
Eisenberg and Cottle [10] in the absence of arbitrary constraints, except
that the regularity condition is required in [10] only for the primal.
Assumption 3 is an additional requirement, which represents a genuine restriction.
However, this restriction does not exclude from the class of problems
for which Theorem 2.1 holds the most important special case, namely,
mixed-integer nonlinear programs. Indeed, when $m_1 = 0$ then $u^1$ disappears
from the problem, which means that the separability requirement is met.

The assumptions of Theorem 2.1 can be weakened for various specific
functions $K(x,u)$. Thus, for

\[
K(x,u) = cx + ub - uAx + u^1A^{11}x^1
\tag{2.13}
\]

(P) and (D) become the pair of dual problems discussed in section 1. In
this case assumptions 1 and 2 can be dropped (1 is satisfied by definition,
2 is simply not required), whereas assumption 3 can be replaced by the weaker
separability requirement of Theorem 1.1 (weaker, since assumption 3 would
require $A^{12}$ or $A^{21}$ to be a zero matrix).
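The reduction to section 1 can be verified term by term. The computation below is a sketch added for illustration; it uses only the partition (1.1) and the slack definitions $y = b - Ax$, $v = uA - c$ of section 1.

```latex
% Partial gradients of K(x,u) = cx + ub - uAx + u^1 A^{11} x^1.
% The extra term u^1 A^{11} x^1 depends on neither u^2 nor x^2.
\[
  \nabla_{u^2} K = b^2 - A^{21}x^1 - A^{22}x^2 = y^2 , \qquad
  \nabla_{x^2} K = c^2 - u^1A^{12} - u^2A^{22} = -v^2 .
\]
% Primal objective: use ub - uAx = u^1 y^1 + u^2 y^2.
\[
  f = K - u^2\nabla_{u^2}K
    = cx + u^1y^1 + u^2y^2 + u^1A^{11}x^1 - u^2y^2
    = cx + u^1y^1 + u^1A^{11}x^1 .
\]
% Dual objective: use cx - uAx = -v^1 x^1 - v^2 x^2.
\[
  g = K - x^2\nabla_{x^2}K
    = ub - v^1x^1 - v^2x^2 + u^1A^{11}x^1 + v^2x^2
    = ub - v^1x^1 + u^1A^{11}x^1 .
\]
% Constraints of the nonlinear pair become sign conditions on the slacks.
\[
  \nabla_{u^2}K \ge 0 \iff y^2 \ge 0 , \qquad
  \nabla_{x^2}K \le 0 \iff v^2 \ge 0 .
\]
```

The coupling between $u^1$ and $x$ that remains in $K$ is $-u^1A^{12}x^2$, and the coupling between $x^1$ and $u$ is $-u^2A^{21}x^1$, which is why assumption 3 would force $A^{12}$ or $A^{21}$ to vanish.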

Further, for

\[
K(x,u) = cx + ub - uAx + \tfrac12(xCx - uEu) + u^1A^{11}x^1
\tag{2.14}
\]

where

\[
C = \begin{pmatrix} C^{11} & C^{12}\\ C^{21} & C^{22} \end{pmatrix},\qquad
E = \begin{pmatrix} E^{11} & E^{12}\\ E^{21} & E^{22} \end{pmatrix}
\tag{2.15}
\]

are symmetric matrices of order $n$ and $m$ respectively, with $C^{22}$ and $E^{22}$
negative semi-definite and of order $n - n_1$ and $m - m_1$ respectively, our pair of
dual problems becomes

\[
\text{(P1)}\qquad
\begin{array}{l}
\min_{u^1}\max_{x,u^2}\ cx + \tfrac12 xCx + \tfrac12 uEu + u^1y^1 + u^1A^{11}x^1\\
Ax + Eu + y = b\\
x^1 \in X^1,\ u^1 \in U^1\\
x^2, u^2, y^2 \ge 0\\
y^1\ \text{unconstrained}
\end{array}
\]

\[
\text{(D1)}\qquad
\begin{array}{l}
\max_{x^1}\min_{x^2,u}\ ub - \tfrac12 uEu - \tfrac12 xCx - v^1x^1 + u^1A^{11}x^1\\
uA - xC - v = c\\
u^1 \in U^1,\ x^1 \in X^1\\
u^2, x^2, v^2 \ge 0\\
v^1\ \text{unconstrained}
\end{array}
\]

This generalizes the symmetric dual quadratic programs of Cottle [12]
by letting some of the primal and dual variables be arbitrarily constrained.
In this case, the regularity condition is not required, and the
separability assumption can be weakened, viz., replaced by the requirement
that $E^{21} = 0$ and $v^2$ be componentwise separable with respect to $u^1$,
or $C^{12} = 0$ and $y^2$ be componentwise separable with respect to $x^1$.


The mixed-integer quadratic programming problem is a special case
of (P1), in which $X^1$ is the set of $n_1$-vectors with nonnegative integer
components, $m_1 = 0$ and $E$ is a null matrix. (For a detailed discussion
of the quadratic case, see [7].)

Finally, let us consider the case when $K(x,u) = f(x) - uF(x)$,
where $f(x)$ is a scalar function and $F(x)$ an m-component vector function
of $x \in R^n$, and let $F(x) = [F^1(x),F^2(x)]$, where $F^1(x)$ and $F^2(x)$ have $m_1$
and $m - m_1$ components respectively. Then our pair of dual problems
generalizes the dual nonlinear programs studied by Wolfe [13],
Mangasarian [14], and Huard [15]:

\[
\text{(P2)}\qquad
\begin{array}{l}
\min_{u^1}\max_{x}\ f(x) - u^1F^1(x)\\
F^2(x) \le 0\\
x^1 \in X^1,\ u^1 \in U^1\\
x^2 \ge 0
\end{array}
\]

\[
\text{(D2)}\qquad
\begin{array}{l}
\max_{x^1}\min_{x^2,u}\ f(x) - uF(x) - x^2\nabla_{x^2}[f(x) - uF(x)]\\
\nabla_{x^2}[f(x) - uF(x)] \le 0\\
x^1 \in X^1,\ u^1 \in U^1\\
x^2, u^2 \ge 0
\end{array}
\]


Assumptions 1, 2, and 3 of Theorem 2.1 are now to be maintained, but
the regularity condition for (P) and (D) can be weakened so as to read:

(a) If $(\bar x,\bar u^1)$ solves (P2), the inequality set $F^2(\bar x^1,x^2) \le 0$ satisfies
the Kuhn-Tucker constraint qualification [16] at $x^2 = \bar x^2$.

(b) If $(\bar x,\bar u)$ solves (D2), the matrix $\nabla^2_{x^2}[f(\bar x) - \bar uF(\bar x)]$ is nonsingular.

Theorem 2.1 then becomes

Corollary 2.1. Given the assumptions 1, 2, 3 of Theorem 2.1, if (P2)
has an optimal solution $(\bar x,\bar u^1)$, there exists $\bar u^2$ such that $(\bar x,\bar u) = (\bar x,\bar u^1,\bar u^2)$
is an optimal solution to (D2). Conversely, if (D2) has an optimal solution
$(\bar x,\bar u)$ then $(\bar x,\bar u^1)$ is an optimal solution to (P2).

In both cases, (2.2) and (2.3) hold.

3.  Linearization  of  the  Dual 

An undesirable characteristic of the dual problems (P) and (D)
discussed in the previous section is the presence of the arbitrarily
constrained primal variables $x^1$ in the dual inequality set. This was not
the case for the linear problem discussed in section 1.

Now consider again the nonlinear problem (P) of section 2, and let
$K(x,u)$ be also differentiable in $x^1$ on the set $\{x^1 \in R^{n_1} \mid x^1 \ge 0\}$ for each $x^2,u$.
Then consider the problem [9]:

\[
\text{(D')}\qquad
\begin{array}{l}
\max_{s}\min_{x,u}\ g' = K(x,u) - x\nabla_xK(x,u) + s\nabla_{x^1}K(x,u)\\
\nabla_{x^2}K(x,u) \le 0\\
s \in X^1,\ u^1 \in U^1\\
x, u^2 \ge 0
\end{array}
\]

where $s$ is an $n_1$-vector. Let $W'$ be the constraint set of (D').

The inequality set of (D'), unlike that of (D), is independent of the
arbitrarily constrained variables $s \in X^1$; and the optimand of (D'), unlike
that of (D), is linear in these same variables $s \in X^1$. We shall show,
however, that with two additional assumptions (D') is equivalent to (D).

In view of its linearity in the arbitrarily constrained variables $s$, (D')
will be called the linearized dual of (P).

Theorem 3.1. Assume 1, 2, 3 as in Theorem 2.1 (regularity also assumed
for (D')), and

4. $K(x,u)$ is concave in $x^1$ on the set $\{x^1 \in R^{n_1} \mid x^1 \ge 0\}$ for each $x^2,u$.

5. $X^1 \subset \{s \in R^{n_1} \mid s \ge 0\}$.

Then the following statements hold:

a) If $(\bar x,\bar u)$ solves (P), then $(\bar s,\bar x,\bar u)$, where $\bar s = \bar x^1$, solves (D').

b) If $(\bar x,\bar u)$ solves (D), then $(\bar x,\bar u)$ solves (P) and $(\bar s,\bar x,\bar u)$, where
$\bar s = \bar x^1$, solves (D').

c) If $(\bar s,\bar x,\bar u)$ solves (D'), then $\bar s = \bar x^1$ and $(\bar x,\bar u)$ solves (P) and (D).

d) In each of the cases a), b), c),

\[
\min_{u^1}\max_{x,u^2}\ \{f \mid (x,u) \in Z\}
= \max_{s}\min_{x,u}\ \{g' \mid (s,x,u) \in W'\}
= \max_{x^1}\min_{x^2,u}\ \{g \mid (x,u) \in W\}.
\tag{3.1}
\]

Proof. Consider the problem (P'), which clearly is just another way of
writing (P) under assumption 5 above (here $s$ is an $n_1$-vector):

\[
\text{(P')}\qquad
\begin{array}{l}
\min_{u^1}\max_{s,x,u^2}\ K(x,u) - u^2\nabla_{u^2}K(x,u)\\
\nabla_{u^2}K(x,u) \ge 0\\
x^1 - s \ge 0\\
-x^1 + s \ge 0\\
s \in X^1,\ u^1 \in U^1\\
x, u^2 \ge 0
\end{array}
\]

We now restate (P') in the form (P). Let

\[
\xi = (\xi^1,\xi^2) = (s,x),\quad\text{where } \xi^1 = s,\ \xi^2 = x,
\]

\[
\eta = (\eta^1,\eta^2) = (u,t^1,t^2),\quad\text{where } \eta^1 = u^1,\ \eta^2 = (u^2,t^1,t^2),
\]

\[
H(\xi,\eta) = K(x,u) + (t^1 - t^2)(x^1 - s).
\]

Then (P') can be stated as the problem (P''):

\[
\text{(P'')}\qquad
\begin{array}{l}
\min_{\eta^1}\max_{\xi,\eta^2}\ H(\xi,\eta) - \eta^2\nabla_{\eta^2}H(\xi,\eta)\\
\nabla_{\eta^2}H(\xi,\eta) \ge 0\\
\xi^1 \in X^1,\ \eta^1 \in U^1\\
\xi^2, \eta^2 \ge 0
\end{array}
\]

We now write the dual (D'') of (P''):

\[
\text{(D'')}\qquad
\begin{array}{l}
\max_{\xi^1}\min_{\xi^2,\eta}\ H(\xi,\eta) - \xi^2\nabla_{\xi^2}H(\xi,\eta)\\
\nabla_{\xi^2}H(\xi,\eta) \le 0\\
\xi^1 \in X^1,\ \eta^1 \in U^1\\
\xi^2, \eta^2 \ge 0
\end{array}
\]

which upon substitution becomes

\[
\begin{array}{l}
\max_{s}\min_{x,u,t^1,t^2}\ K(x,u) - x\nabla_xK(x,u) - (t^1 - t^2)s\\
\nabla_{x^1}K(x,u) + t^1 - t^2 \le 0\\
\nabla_{x^2}K(x,u) \le 0\\
s \in X^1,\ u^1 \in U^1\\
x, u^2, t^1, t^2 \ge 0
\end{array}
\tag{3.2}
\]
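The substitution step can be spelled out as follows. This is a sketch added for illustration; it uses only the definitions $\xi^2 = x$ and $H(\xi,\eta) = K(x,u) + (t^1 - t^2)(x^1 - s)$ given above.

```latex
% Gradient of H in xi^2 = x = (x^1, x^2): only the x^1 block sees t^1 - t^2.
\[
  \nabla_{x^1} H = \nabla_{x^1} K(x,u) + t^1 - t^2 , \qquad
  \nabla_{x^2} H = \nabla_{x^2} K(x,u) .
\]
% Objective of (D''): the terms (t^1 - t^2) x^1 cancel.
\[
  H - x\,\nabla_x H
  = K(x,u) + (t^1 - t^2)(x^1 - s) - x\,\nabla_x K(x,u) - (t^1 - t^2)\,x^1
  = K(x,u) - x\,\nabla_x K(x,u) - (t^1 - t^2)\,s .
\]
% Constraint of (D''), block by block.
\[
  \nabla_{\xi^2} H \le 0
  \iff
  \nabla_{x^1} K(x,u) + t^1 - t^2 \le 0
  \ \text{ and }\
  \nabla_{x^2} K(x,u) \le 0 .
\]
```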

Introducing the slack vector $p \ge 0$ in the first inequality set of (3.2)
and substituting in the objective function for $t^1 - t^2$, we obtain

\[
\begin{array}{l}
\max_{s}\min_{x,u,p}\ K(x,u) - x\nabla_xK(x,u) + s\nabla_{x^1}K(x,u) + sp\\
\nabla_{x^2}K(x,u) \le 0\\
s \in X^1,\ u^1 \in U^1\\
x, u^2, p \ge 0
\end{array}
\tag{3.3}
\]

Since $p$ is nonnegative, (3.3) is equivalent to (D') in the sense that

α) if $(s,x,u,p)$ solves (3.3), then $sp = 0$ and $(s,x,u)$ solves (D');

β) if $(s,x,u)$ solves (D'), then $(s,x,u,p)$, where $p = 0$, solves (3.3).

Then statement a) of Theorem 3.1 follows from the application of
Theorem 2.1 to (P'). Here we need assumption 4, since $x$ plays in (P') the
role of $x^2$ in (P).

To obtain statement b), note that $(\bar s,\bar x,\bar u) \in W'$, where $\bar s = \bar x^1$. Also,
from Theorem 2.1 applied to (D), $(\bar x,\bar u)$ solves (P), hence $(\bar s,\bar x,\bar u)$ solves (P').
Since (P) is assumed to meet the regularity condition, so does (P'),
which implies that $(\bar s,\bar x,\bar u)$ solves (D').

Statement c) follows from the application of Theorem 2.1 to (D').
The fact that $(\bar s,\bar x,\bar u)$ solves (P') implies that $\bar s = \bar x^1$ and $(\bar x,\bar u)$ solves (P).
Applying again Theorem 2.1 to (P), one sees that $(\bar x,\bar u)$ solves (D).

In each of the cases a), b), c), statement d) follows directly from
the proofs given above.

Theorem 3.1 on the linearization of the dual constitutes the basis of
the method for solving mixed-integer nonlinear programs presented in
section 5.


PART II. ALGORITHMS


The theory presented in Part I can be used for computational
purposes. In the linear case, it leads to the same class
of algorithms to which Benders' partitioning procedure [2]
belongs. We shall describe a variant which differs from Benders'
procedure in that it requires the solution of a single pure integer
program instead of a sequence of such programs, and which is
essentially the same as the one described by Lemke and Spielberg
[17] (the differences will be mentioned later).

In the nonlinear case, the above theory leads to a new
algorithm for solving pure or mixed-integer nonlinear programs,
which can be regarded as a generalization of Benders' partitioning
procedure (and its variations) to the nonlinear case.


4. Implicit Enumeration for Mixed-Integer Linear Program


We shall consider the mixed-integer programming problem in
the special form where the integer variables are zero-one variables
([18] and [7] describe techniques for bringing any integer or
mixed-integer linear program to this form):

\[
\text{(P)}\qquad
\begin{array}{l}
\min\ c^1y + c^2x\\
A^1y + A^2x \ge b\\
y_j = 0\ \text{or}\ 1,\quad j \in N\\
x_h \ge 0,\quad h \in H
\end{array}
\]

where $c^1 \in R^n$, $c^2 \in R^p$, $b \in R^m$ and the matrices $A^1$, $A^2$ are given,
and $\{1,\dots,m\} = M$, $\{1,\dots,n\} = N$, $\{1,\dots,p\} = H$.

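The dual of (P) is then the problem (see Section 1)

\[
\text{(D)}\qquad
\begin{array}{l}
\min_{y}\max_{u}\ c^1y + u(b - A^1y)\\
uA^2 \le c^2\\
y \in Y,\ u \ge 0
\end{array}
\]

where

\[
Y = \{y \in R^n \mid y_j = 0\ \text{or}\ 1,\ j \in N\}.
\tag{4.1}
\]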

For each $y \in Y$, (D) becomes a linear program $L(y)$ in $u$.
One could therefore solve (D) by solving $L(y)$ for each element
$y$ of the finite set $Y$, and by choosing that $y \in Y$ which
minimizes the optimal (maximal) solution of $L(y)$. On the other
hand, one could use an implicit enumeration technique [19] if
one could generate constraints to be satisfied by any $y \in Y$
which is a candidate for optimality. The reason why this can
indeed be done is that the inequalities of (D) are independent
of $y$.

Assume we have solved $L(y)$ for a sequence $y^1, y^2, \dots, y^q$
of vectors $y \in Y$. We shall ignore the trivial case when $L(y)$,
and hence (D), has no feasible solution (then (P) has no finite
optimum).
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Let 

(4.2)  £l,  —  ,q}  =  Q  =  QXU  ^ 

where 

(4.3)  Qi  *  {k  e  Q]L(yk)  has  a  finite  optimum] 

Qj  =  {k  e  Q|L(yk)  has  no  finite  optimum] 

.  k 

For  k  e  let  uK  be  an  optimal  solution  of  L(y  ),  and 

let  gk  be  the  optimal  value  of  the  objective  function  of  L(yk). 

Further,  let 

(4.4)  g*  *  min  gk 

keQj 

tf 

For  k  e  Qj*  L(y  )  has  a  feasible  solution  of  the  form 

(4.5)  uk  +  Xt\  X  >  0 

k  k 

where  u  is  an  extreme  point  and  t  a  direction  vector  for 

an  extreme  ray  of  the  convex  polytope  of  feasible  solutions  to 

k  k  2 

L(y  ),  t  being  a  solution  of  the  homogeneous  system  tA  <0. 

Since  the  constraints  of  L(y)  are  independent  of  y,  any 
k  k 

optimal  solution  u  to  a  linear  program  L(y  ),  as  well  as  any 
feasible  solution  uk  +  Xtk  of  the  type  described  above,  is  a  feasible 

solution  to  all  other  linear  programs  L(y).  Hence,  we  have 
Theorem 4.1. Any $y \in Y$ (if one exists) such that

(4.6)  $\max_{u \ge 0} \{u(b - A^1 y) + c^1 y \mid uA^2 \le c^2\} < g^*$

satisfies the constraints

(4.7)  $(c^1 - u^k A^1)\, y < g^* - u^k b$,  $k \in Q_1$

and

(4.8)  $-t^k A^1 y \le -t^k b$,  $k \in Q_2$

Proof. Suppose y violates (4.7) for $p \in Q_1$, i.e.

   $u^p b + (c^1 - u^p A^1)\, y \ge g^*$

Then, since $u^p$ is a feasible solution to L(y),

   $\max_{u \ge 0} \{u(b - A^1 y) + c^1 y \mid uA^2 \le c^2\} \ge u^p (b - A^1 y) + c^1 y \ge g^*$

which contradicts (4.6).

On the other hand, if y violates (4.8) for $p \in Q_2$,
i.e., if $t^p (b - A^1 y) > 0$, then the objective function of L(y)
can be increased indefinitely by setting $u = u^p + \lambda t^p$, $\lambda \ge 0$,
and by increasing $\lambda$, which again contradicts (4.6). Q.e.d.
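As an illustration of Theorem 4.1, the following minimal sketch builds the coefficient vectors of a constraint of type (4.7) from a dual point u and of a constraint of type (4.8) from a ray direction t. The function names, the list-of-rows matrix layout and the sample data are assumptions made for this example only; they do not come from the paper.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def vec_mat(u, A):
    """Row vector u times matrix A (A given as a list of rows)."""
    return [sum(u[i] * A[i][j] for i in range(len(A))) for j in range(len(A[0]))]


def value_cut(u, A1, c1, b, g_star):
    """Constraint of type (4.7): coef . y < rhs, with
    coef = c1 - u A1 and rhs = g* - u b."""
    uA1 = vec_mat(u, A1)
    coef = [c1[j] - uA1[j] for j in range(len(c1))]
    return coef, g_star - dot(u, b)


def ray_cut(t, A1, b):
    """Constraint of type (4.8): coef . y <= rhs, with
    coef = -t A1 and rhs = -t b."""
    return [-v for v in vec_mat(t, A1)], -dot(t, b)


# Hypothetical data: two rows, two 0-1 variables.
A1 = [[1, 2], [0, 1]]
c1 = [3, 1]
b = [4, 2]
coef, rhs = value_cut([1, 1], A1, c1, b, g_star=10)
ray_coef, ray_rhs = ray_cut([0, 1], A1, b)
```

Any y with `dot(coef, y) >= rhs` or `dot(ray_coef, y) > ray_rhs` can be discarded, since by the theorem it cannot improve on the current g*.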

We can now systematically search the set Y by applying the
exclusion tests of implicit enumeration [18], [19] to the
constraints (4.7), (4.8). Whenever a $y \in Y$ is found that satisfies
the current constraints, it is introduced into the objective
function of the linear program L(y), which is then post-optimized.
This in turn yields a new constraint (4.7), and possibly (4.8),
which is not satisfied by the current y. It may also yield an
improved value of $g^*$. A typical iteration of the algorithm
consists then of the following two phases:

I. (Steps 1-4 below). Using implicit enumeration techniques,
find a vector $y^s \in Y$ satisfying the current constraints (4.7) and
(4.8). Then go to II.

II. (Steps 5-6 below). Solve (post-optimize) $L(y^s)$, add a
new constraint to (4.7) and possibly to (4.8), and (possibly)
update $g^*$. Then go to I.

Whenever a new phase I is started, the implicit enumeration
over the set Y is continued from where it had been interrupted
at the end of the previous phase I: those elements of Y that
had been excluded as infeasible for the current constraint set
certainly do not become feasible by the addition of new
constraints. The procedure ends when there is no $y \in Y$ satisfying
the current constraints (4.7) and (4.8). Then, if $Q_1 \ne \emptyset$, the
vector y associated with the current $g^*$ yields an optimal
solution, or, if $Q_1 = \emptyset$, (P) has no feasible solution at all.
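The interplay of the two phases can be sketched as follows. This is only an illustration under strong simplifying assumptions: phase I is a plain scan over $\{0,1\}^n$ instead of implicit enumeration, and L(y) is solved by a caller-supplied routine `solve_L` (a hypothetical interface). The toy oracle handles a single constraint row and a single continuous variable with $a_2 \ge 0$, $c_2 \ge 0$, for which L(y) can be solved by inspection.

```python
from itertools import product


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def vec_mat(u, A):
    return [sum(u[i] * A[i][j] for i in range(len(A))) for j in range(len(A[0]))]


def two_phase(n, c1, A1, b, solve_L):
    """Return (g*, best y). solve_L(y) must return (kind, u, t, g) with
    kind "opt" (u optimal, value g) or "ray" (u feasible, t a ray)."""
    g_star, best = float("inf"), None
    value_cuts, ray_cuts = [], []      # stored as (coef, const) pairs

    def candidate(y):
        # (4.7): (c1 - u A1) y + u b < g*,  (4.8): t (b - A1 y) <= 0
        return (all(dot(cf, y) + k < g_star for cf, k in value_cuts)
                and all(dot(cf, y) + k <= 0 for cf, k in ray_cuts))

    for y in product((0, 1), repeat=n):    # one scan, never restarted
        if not candidate(y):               # phase I
            continue
        kind, u, t, g = solve_L(y)         # phase II
        uA1 = vec_mat(u, A1)
        value_cuts.append(([c1[j] - uA1[j] for j in range(n)], dot(u, b)))
        if kind == "ray":
            ray_cuts.append(([-v for v in vec_mat(t, A1)], dot(t, b)))
        elif g < g_star:
            g_star, best = g, y
    return g_star, best


def scalar_oracle(c1, a1, a2, c2, b):
    """Toy L(y) = max {u (b - a1 y) + c1 y : u a2 <= c2, u >= 0}, scalar u."""
    def solve_L(y):
        r = b - dot(a1, y)
        if r > 0 and a2 == 0:
            return "ray", [0], [1], None   # objective unbounded along t = 1
        u = c2 / a2 if r > 0 else 0
        return "opt", [u], None, u * r + dot(c1, y)
    return solve_L


# min 3 y1 + 2 y2 + 5 x  subject to  2 y1 + y2 + x >= 2, x >= 0
result = two_phase(2, [3, 2], [[2, 1]], [2], scalar_oracle([3, 2], [2, 1], 1, 5, 2))
```

On this toy instance the scan solves L(y) three times and skips y = (1,1), which is cut off by the constraint generated from y = (1,0).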

To discuss the algorithm in detail, we shall change the
notation. $Q_1$ and $Q_2$ will now be considered disjoint ordered
sets (i.e., each inequality (4.8) will have a different index
from each inequality (4.7)), denoted by Q and T respectively,
and the two sets of inequalities (4.7), (4.8) will be written as
a single set

(4.9)  $\sum_{j \in N} \alpha_{ij} y_j \ge \beta_i$,  $i \in V = Q \cup T$

(4.10)  $\alpha^i = u^i A^1 - c^1$ for $i \in Q$,  $\alpha^i = t^i A^1$ for $i \in T$;
        $\beta_i = u^i b - g^* + \varepsilon$ for $i \in Q$,  $\beta_i = t^i b$ for $i \in T$
where $\varepsilon$ is a positive number sufficiently small to enable
us to replace the strict inequalities of (4.7) by ordinary
inequalities, without unduly excluding from consideration any
$y \in Y$. In other words, $\varepsilon$ can be any number satisfying

(4.11)  $0 < \varepsilon \le |g^j - g^h|$

for all pairs of indices j, h such that $g^j \ne g^h$.

We are interested in generating vectors $y \in Y$ satisfying
(4.9). Any $y \in Y$ will be called a solution, and a solution
satisfying (4.9) will be called feasible. In the process, we
shall generate a sequence of pseudo-solutions $\psi_k$, a
pseudo-solution (or partial solution) $\psi_k$ being defined as a
set of 0-1 value-assignments to some components of y:

(4.12)  $\psi_k = \{y_j = \delta_j^k,\ j = j_1, \ldots, j_q\}$,  $1 \le q \le n$

where each $\delta_j^k$ represents one of the values 0 and 1.

Let $J_k^1$ (and $J_k^0$ respectively) be the set of those $j \in N$
such that the j-th component of y is assigned by $\psi_k$ the value 1
(the value 0), i.e.,

(4.13)  $J_k^1 = \{j \in N \mid \delta_j^k = 1\}$,  $J_k^0 = \{j \in N \mid \delta_j^k = 0\}$

and let

(4.14)  $N_k = N - (J_k^1 \cup J_k^0)$

We shall say that, at the stage characterized by the
pseudo-solution $\psi_k$, $y_j$ is fixed at 1 if $j \in J_k^1$, fixed at 0 if
$j \in J_k^0$, and free if $j \in N_k$.

The solution $y^k$ defined by

(4.15)  $y_j^k = \delta_j^k$ for $j \in J_k^1 \cup J_k^0$,  $y_j^k = 0$ for $j \in N_k$

will be called the solution associated with $\psi_k$.
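A pseudo-solution can be held in a very small data structure. The sketch below stores it as a dictionary from (0-based) variable index to its fixed value; this representation and the function names are assumptions for illustration, not part of the paper.

```python
def index_sets(psi, n):
    """Return (J1, J0, free): indices fixed at 1, fixed at 0, and free,
    in the sense of (4.13) and (4.14)."""
    J1 = {j for j, value in psi.items() if value == 1}
    J0 = {j for j, value in psi.items() if value == 0}
    free = set(range(n)) - J1 - J0
    return J1, J0, free


def associated_solution(psi, n):
    """Solution associated with psi as in (4.15): fixed components keep
    their assigned value, free components are set to 0."""
    return [psi.get(j, 0) for j in range(n)]


# Hypothetical pseudo-solution on four variables: y_0 = 1, y_2 = 0.
psi = {0: 1, 2: 0}
J1, J0, free = index_sets(psi, 4)
y = associated_solution(psi, 4)
```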

In order to keep track of the sequence of pseudo-solutions
that will be generated, we shall associate with this sequence an
arborescence (rooted tree) $\mathcal{A}$. Each node h of $\mathcal{A}$ corresponds
to a pseudo-solution $\psi_h$, to a solution $y^h$ associated with
$\psi_h$ via (4.15), and to a linear program $L(y^h)$. Each arc (h,k)
of $\mathcal{A}$ corresponds to a pair of pseudo-solutions $\psi_h$, $\psi_k$ such
that $\psi_k$ has been generated from $\psi_h$. Since the generating
procedure is such that

(4.16)  $J_h^1 \subset J_k^1$,  $J_h^0 \subset J_k^0$

i.e., $\psi_k$ is generated from $\psi_h$ by fixing at 1 a free component
of y, an arc (h,k) will also be associated with the (unique)
variable $y_j$ which is free at node h and fixed at 1 at node k.
For the same reason, any pseudo-solution $\psi_k$ such that $J_h^1 \subset J_k^1$
and $J_h^0 \subset J_k^0$ will be called a descendant of $\psi_h$ if actually
generated, and a potential descendant otherwise.

The implicit enumeration procedure that we are going to apply
to the elements of Y is based on the use of tests of the type
introduced in [18]. We shall assume that $c^1_j \ge 0$, which is not a
restriction, since $c^1_j$, if negative, can always be made positive
by a substitution of the form $y_j' = 1 - y_j$. Further, in order to
be able to use in this context tests which place bounds on the
value of the objective function, we compute a lower bound $\gamma$ on
$c^2 x$ (the existence of which follows from that of a finite optimum
for (P)):

(4.17)  $\gamma = \min_{x \ge 0} \{c^2 x \mid A^1 y + A^2 x \ge b,\ 0 \le y_j \le 1,\ j \in N\}$

We start with $V = \emptyset$, which admits an arbitrary $y \in Y$. We
choose as a starting solution (root of $\mathcal{A}$) y = 0, and set $g^* = +\infty$.

In order to describe a typical iteration, let us suppose
that the last pseudo-solution generated was $\psi_k$, with the
associated solution $y^k$ satisfying (4.9), and that by solving
$L(y^k)$ the system (4.9) has been augmented and updated so that
it is not satisfied any more by $y^k$ (we shall see that this is
the situation at the beginning of each new iteration).

Let $J_k^1$, $J_k^0$ and $N_k$ be the index sets defined by (4.13),
(4.14) associated with $\psi_k$, and let

(4.18)  $N_k^{i+} = \{j \in N_k \mid \alpha_{ij} > 0\}$,  $N_k^{i-} = \{j \in N_k \mid \alpha_{ij} < 0\}$,  $i \in V$

(4.19)  $\hat\beta_i = \beta_i - \sum_{j \in J_k^1} \alpha_{ij}$,  $i \in V$

(4.20)  $V^+ = \{i \in V \mid \hat\beta_i > 0\}$

We then proceed as follows:

Step 1. Compute

(4.21)  $\bar\beta_i = \hat\beta_i - \sum_{j \in N_k^{i+}} \alpha_{ij}$,  $i \in V^+$

If $\bar\beta_i > 0$ for some $i \in V^+$, backtrack (go to Step 4).
If $\bar\beta_i \le 0$, $\forall i \in V^+$, go to Step 2.

Step 2. Let

(4.22)  $\hat\beta_o = \max_{i \in V} \hat\beta_i$

Order the indices $j \in N_k^{o+}$ so that

(4.23)  $c^1_{j_1}/\alpha_{oj_1} \le c^1_{j_2}/\alpha_{oj_2} \le \ldots \le c^1_{j_t}/\alpha_{oj_t}$

and find an index $j_r \in N_k^{o+}$ such that

(4.24)  $\sum_{h=1}^{r-1} \alpha_{oj_h} < \hat\beta_o \le \sum_{h=1}^{r} \alpha_{oj_h}$

Compute

(4.25)  $\Delta g = g^* - \gamma - \sum_{j \in J_k^1} c^1_j - \sum_{h=1}^{r-1} c^1_{j_h} - \frac{c^1_{j_r}}{\alpha_{oj_r}} \left( \hat\beta_o - \sum_{h=1}^{r-1} \alpha_{oj_h} \right)$

where $\gamma$ is defined by (4.17).

If $\Delta g \le 0$, backtrack (go to Step 4).

If $\Delta g > 0$, go to Step 3.

Steps 1 and 2 are exclusion tests meant to identify such nodes
of $\mathcal{A}$ that cannot have among their potential descendants nodes
associated with feasible solutions $y \in Y$ "better" than the currently
best one. Thus, in the first test, if the quantity computed in (4.21)
is positive for some $i \in V^+$, then the i-th constraint cannot be
satisfied by assigning whatever values (0 or 1) to the free variables.
Hence, one can backtrack, i.e., abandon the current node of $\mathcal{A}$
(i.e., the current y) with all its potential descendants.

The second test consists in choosing the "most violated"
constraint, and computing a lower bound on the "cost" of satisfying it
by assigning values 1 to some of the free variables. $\Delta g$ is the
difference between $g^*$ and this lower bound, the latter being
expressed as a sum of $\gamma$ (a lower bound on $c^2 x$) and the rest of
the expression on the right-hand side of (4.25) (a lower bound on
$c^1 y$). Hence, if $\Delta g \le 0$, no descendant of the current node can
yield a lower value of g than $g^*$, and again we can backtrack,
i.e., abandon the current node with all its potential descendants.

Other tests used in [18, 19] or suggested elsewhere in a
similar context can also be introduced at this point.

Step 3. Generate the pseudo-solution $\psi_{k+1}$ (and the
associated node of $\mathcal{A}$), defined by

(4.26)  $J_{k+1}^1 = J_k^1 \cup \{j_1\}$,  $J_{k+1}^0 = J_k^0$

where $j_1$ is given by (4.23), and update $\hat\beta_i$, $i \in V$, i.e.,
set

(4.27)  $\hat\beta_i = \beta_i - \sum_{j \in J_{k+1}^1} \alpha_{ij}$,  $i \in V$

If $\hat\beta_i > 0$ for some $i \in V$, set k + 1 = k and go to
Step 1.

If $\hat\beta_i \le 0$, $\forall i \in V$, introduce $y^s = y^{k+1}$, the solution
associated with $\psi_{k+1}$, into the objective function of L(y), and
go to Step 5.

Step 4. Backtrack to the predecessor h of the current
node k in $\mathcal{A}$. Let $y_j$ be the variable associated with the
backtracking arc. Update the sets $N_h$ and $J_h^0$ by replacing
them through $N_h - \{j\}$ and $J_h^0 \cup \{j\}$, respectively, i.e.,
remove j from the set of free indices by fixing $y_j$ at 0.
Go to Step 1. If backtracking is not possible (if we are at
the root of $\mathcal{A}$ and instructed to backtrack), terminate:
if $g^* < \infty$, the solution associated with $g^*$ is optimal;
if $g^* = \infty$, (P) has no feasible solution.

Step 3 generates a new solution by fixing a hitherto free
variable at 1. If the solution
associated with the new pseudo-solution obtained in this way is
not feasible, the tests are repeated. If it is, one introduces
the new vector $y^s$ into the objective function of L(y) and
one goes to the step dealing with $L(y^s)$.

In Step 4 we backtrack to the predecessor of the current
node, and by fixing at 0
the variable associated with the backtracking arc we make sure
that the abandoned node and its potential descendants will
never be visited in any future step.
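The forward and backtracking moves of Steps 3 and 4 can be sketched as a depth-first generator over a fixed constraint system of the form (4.9). This is a simplified illustration: the branching variable is the lowest free index rather than the one chosen by (4.23), only the first exclusion test is applied, and the constraints are not updated during the search. All names are assumptions.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def enumerate_feasible(alpha, beta, n):
    """Yield every 0-1 vector y with alpha[i] . y >= beta[i] for all i."""

    def violated(y):
        return any(dot(a, y) < bta for a, bta in zip(alpha, beta))

    def hopeless(J1, free):
        # A violated row that the free variables cannot repair.
        for a, bta in zip(alpha, beta):
            bh = bta - sum(a[j] for j in J1)
            if bh > 0 and bh > sum(a[j] for j in free if a[j] > 0):
                return True
        return False

    def visit(J1, J0):
        free = [j for j in range(n) if j not in J1 and j not in J0]
        J0 = set(J0)                     # local copy for this node
        for j in free:
            if hopeless(J1, set(range(n)) - J1 - J0):
                return                   # abandon node and descendants
            child = J1 | {j}             # forward move: fix y_j at 1
            y = tuple(1 if i in child else 0 for i in range(n))
            if not violated(y):
                yield y
            yield from visit(child, J0)
            J0.add(j)                    # backtracking arc: fix y_j at 0

    root = (0,) * n
    if not violated(root):
        yield root
    yield from visit(set(), set())


# Hypothetical system: 2 y0 + 3 y1 - y2 >= 4.
found = list(enumerate_feasible([[2, 3, -1]], [4], 3))
```

Because a variable is fixed at 0 as soon as its arc is backtracked, no subset of indices is generated twice.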

Step 5. Solve (post-optimize) $L(y^s)$.

If $L(y^s)$ has an optimal solution $u^s$,
add to (4.9) the constraint

(4.28)  $(u^s A^1 - c^1)\, y \ge u^s b - g^* + \varepsilon$

Then, if $g^s < g^*$, update $g^*$ in all constraints of type (4.7) by
setting $g^* = g^s$. If $L(y^s)$ has no finite optimum, let $u^s + \lambda t^s$
be a feasible solution for any $\lambda \ge 0$. Add to (4.9) the constraint
(4.28), and the constraint

(4.29)  $t^s A^1 y \ge t^s b$

In all cases, if $|V| < 2n$, where $|V|$ stands for the current
number of constraints (4.9), go to Step 1. Otherwise go to
Step 6.

Step 6. If at Step 5 we have generated one constraint, drop
from (4.9) the constraint $i_*$ defined by

(4.30)  $\hat\beta_{i_*} = \min_{i \in V} \hat\beta_i$

If at Step 5 two constraints have been generated, drop from
(4.9) the constraints $i_*$, defined by (4.30), and $i_{**}$, defined
by

(4.31)  $\hat\beta_{i_{**}} = \min_{i \in V - \{i_*\}} \hat\beta_i$

Go to Step 1.

In Step 5 the solution (post-optimization) of $L(y^s)$ is
used to generate one or two new constraints for (4.9). If the
objective function of $L(y^s)$ at the optimum is smaller than $g^*$,
the latter is replaced by the new value in all constraints of
type (4.7).

Step  6  is  meant  to  keep  the  number  of  constraints  constant 
after  a  certain  level  has  been  reached,  by  eliminating  the 
"loosest"  constraint  (or  pair  of  constraints).  The  level  chosen 
here,  2n,  is  arbitrary,  and  can  of  course  be  changed  (the 
more  constraints  are  retained,  the  more  efficient  the  tests  tend 
to  be,  but  the  more  time  it  takes  to  apply  them). 
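The pruning rule of Step 6 amounts to repeatedly deleting the constraint whose residual right-hand side, computed at the current node as in (4.19), is smallest. A minimal sketch, assuming the constraints are held in two parallel lists that may be modified in place (names are illustrative):

```python
def drop_loosest(alpha, beta, J1, generated):
    """Delete `generated` constraints (1 or 2), each time the one with the
    smallest residual beta_i - sum_{j in J1} alpha_ij. Lists are modified
    in place; the indices of the dropped rows are returned."""
    dropped = []
    for _ in range(generated):
        residual = [bta - sum(a[j] for j in J1) for a, bta in zip(alpha, beta)]
        i = min(range(len(residual)), key=residual.__getitem__)
        dropped.append(i)
        del alpha[i]
        del beta[i]
    return dropped


# Hypothetical system of three constraints on two variables, y_0 fixed at 1.
alpha = [[1, 0], [0, 1], [1, 1]]
beta = [1, -3, 2]
dropped = drop_loosest(alpha, beta, J1={0}, generated=1)
```

Here the residuals are 0, -3 and 1, so the second constraint, which is satisfied with the largest slack, is the one removed.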

From the above comments it should be clear that the algorithm
ends in a finite number of iterations. The solution associated
with the last value $g^*$ is optimal; if $g^* = +\infty$, (P) has no
feasible solution.

Indeed, Y is a finite set, and in the process of enumerating
its elements we abandon a subset of elements (associated with a
node of $\mathcal{A}$ and its potential descendants) only when we can
conclude from the tests that there is no element of the subset
which satisfies the current constraints and is "better" than the
currently "best" element. On the other hand, Theorem 4.1 shows
that a vector $y \in Y$ can possibly be "better" than the current
"best" one only if it satisfies the current constraints (4.7), (4.8).
Finally, the implicit enumeration technique is such that no
abandoned node can ever be visited again, nor can any of the
potential descendants of such a node be generated.

The above algorithm is closely related to the partitioning
procedure of Benders [2]. The Benders procedure, however,
prescribes for phase I the finding of an "optimal" $y \in Y$, i.e.,

A 

one  that  maximizes  ,  which  implies  the  solution  of  an  integer 

o 

programming problem each time we get into phase I. Our
procedure avoids this, and requires only the finding of a feasible
$y \in Y$ in each phase I, so that the complete sequence of phases I
amounts to solving one simple integer program. This procedure
is essentially identical with the one described by Lemke and
Spielberg [17], with the following minor differences:

(a) we work with L(y) rather than its dual, which permits the
use of a primal algorithm for the post-optimization required in
each phase II; (b) we generate the lower bounds (4.17) and (4.25)
and use them in what seems to be a strong exclusion test; (c)
we work with a fixed number of constraints (4.9).


5.  An  Algorithm  for  Integer  and  Mixed-Integer
Nonlinear  Programming

We shall now discuss a generalization of the procedure described in
section 4 to the nonlinear case [7,9].

Consider the mixed-integer nonlinear program

      max $f(y,x)$

(P)   $F(y,x) \le 0$

      $y \in Y$,  $x \ge 0$

where f(y,x) is a scalar function and F(y,x) an m-component vector function
of $y \in R^n$, $x \in R^p$, and $Y \subset R^n$ is the set of n-vectors with nonnegative integer
components. This is a special case of problem (P2) of section 2, in which
$m_1 = 0$.

Let $u \in R^m$ and let the function

(5.1)  $K(y,x,u) \equiv f(y,x) - uF(y,x)$

be differentiable in y and twice differentiable in x.

The dual of (P), as defined in section 2, is then

      $\max_y \min_{x,u}\; g = K(y,x,u) - x\nabla_x K(y,x,u)$

(D)   $\nabla_x K(y,x,u) \le 0$

      $y \in Y$;  $x, u \ge 0$

Problem (D) does not seem to be of any use in solving (P), since
its inequality set contains the integer-constrained primal variables y,
and its objective function is nonlinear in the latter. However, in
section 3 we have introduced a linearized (in $y \in Y$) dual (D') of (P). We
shall use a slightly different notation here, in that we shall continue
to denote by y the integer-constrained variable of the dual, and shall
let the newly introduced variable $s \in R^n$ be continuous:

      $\max_y \min_{s,x,u}\; g' = K(s,x,u) - (s,x)\nabla_{s,x} K(s,x,u) + y\nabla_s K(s,x,u)$

(D')  $\nabla_x K(s,x,u) \le 0$

      $y \in Y$;  $s, x, u \ge 0$

Here $\nabla_{s,x} K = (\nabla_s K, \nabla_x K)$, $\nabla_s K$ being the vector of partial derivatives
of K in the components of s.

The inequality set of (D') is independent of the integer-constrained
variables y; moreover, the objective function g' is linear in y. In view
of the results of section 3, this opens the way to the approach of solving
(P) by solving (D'). To restate those results relating (D') to (P) for the
special case under consideration, we recall from section 2 that the regularity
condition for the above problems (P), (D') is as follows:

(a) If (P) has an optimal solution $(\bar y, \bar x)$, the inequality set $F(\bar y, x) \le 0$
satisfies the Kuhn-Tucker [16] constraint qualification at $x = \bar x$.

(b) If (D') has an optimal solution $(\hat y, \hat s, \hat x, \hat u)$, the matrix
$\nabla_x^2 K(\hat s, \hat x, \hat u)$ is nonsingular.

Denoting by Z and W' the constraint sets of (P) and (D') respectively,
the relevant parts of Theorem 3.1 become for this case

Theorem 5.1. Let f(y,x) and each component of -F(y,x) be differentiable
and concave in y, x on the set $\{(y,x) \in R^n \times R^p \mid y, x \ge 0\}$, and assume
that (P) and (D') meet the regularity condition. Then

a) If $(\bar y, \bar x)$ solves (P), there exists $\bar u \in R^m$ such that $(\bar y, \bar s, \bar x, \bar u)$,
where $\bar s = \bar y$, solves (D').

b) If $(\hat y, \hat s, \hat x, \hat u)$ solves (D'), then $\hat y = \hat s$ and $(\hat y, \hat x)$ solves (P).

c) In both cases a) and b),

(5.2)  $\max \{f(y,x) \mid (y,x) \in Z\} = \max_y \min_{s,x,u} \{g' \mid (y,s,x,u) \in W'\}$
The proof of this theorem is along the same lines as that of Theorem 3.1,
with the following observations:

$\alpha$) The linearity of K(y,x,u) in u, along with the assumptions on
f(y,x) and F(y,x) and the regularity condition, make up for
assumptions 1, 2, and 4 of Theorem 3.1. As to assumption 3 of that
theorem, it is taken care of by the fact that $m_1 = 0$. Assumption 5
holds by the definition of Y.

$\beta$) The regularity condition required for Theorem 3.1 can be replaced
by the weaker regularity condition stated above, because the
duality theorems of Wolfe [13] and Huard [15] can now replace
the one by Dantzig, Eisenberg and Cottle [10] in the proof of the
above theorem.

Remark 1. In the regularity condition stated above, the Kuhn-Tucker
constraint qualification can of course be replaced by that of Slater [20]
or Arrow-Hurwicz-Uzawa [21], or any other constraint qualification under
which the duality theorem of [13] holds. On the other hand, if the regularity
condition for (D') is replaced by the weaker "low-value property" requirement
introduced by Mangasarian and Ponstein [22], then the "strict" converse
duality statement b), based on [15], has to be replaced by a weaker converse
duality statement of the type [22]. In all these cases, the theorem can
still serve as a basis for the algorithm to be described below.

Remark 2. If $R^n \times R^p$ reduces to $R^n$, i.e., (P) is a pure integer
nonlinear program in y, its linearized dual (D') becomes a mixed-integer
max-min problem (D°) in nonnegative variables, otherwise unconstrained,
and linear in the integer-constrained variables:

(D°)  $\max_{y \in Y} \min_{s,u \ge 0}\; K(s,u) + (y - s)\nabla_s K(s,u)$


Before discussing the algorithm, let us consider the case when

(5.3)  $K(y,x,u) \equiv c^1 y + c^2 x + ub - u(A^1 y + A^2 x) + \tfrac12 (y,x)\, C \binom{y}{x}$

where b, $c = (c^1, c^2)$, $A = (A^1, A^2)$ and C are of appropriate dimensions,
C being symmetric. (P) is then the mixed-integer quadratic program

      max $c^1 y + c^2 x + \tfrac12 (y,x)\, C \binom{y}{x}$

(P)   $A^1 y + A^2 x \le b$

      $y \in Y$;  $x \ge 0$

whose dual is

      $\max_y \min_{x,u}\; ub - \tfrac12 (y,x)\, C \binom{y}{x} - v^1 y$

(D)   $uA - (y,x)C - v = c$

      $y \in Y$;  $x, u, v^2 \ge 0$;  $v^1$ unconstrained

and whose linearized dual (D') is

      $\max_y \min_{s,x,u}\; ub - \tfrac12 (s,x)\, C \binom{s}{x} - v^1 y$

(D')  $uA - (s,x)C - v = c$

      $y \in Y$;  $s, x, u, v^2 \ge 0$;  $v^1$ unconstrained

No regularity condition is required for this case, and Theorem 5.1
becomes

Theorem 5.2. Let C be negative semi-definite. Then

a) If $(\bar y, \bar x)$ solves (P), there exists $\bar u \in R^m$ such that $(\bar y, \bar s, \bar x, \bar u)$,
where $\bar s = \bar y$, solves (D').

b) If $(\hat y, \hat s, \hat x, \hat u)$ solves (D'), there exists $\tilde x \in R^p$ such that $(\hat y, \tilde s, \tilde x, \hat u)$,
where $\tilde s = \hat y$, also solves (D'), while $(\hat y, \tilde x)$ solves (P).

The proof is along the same lines as for Theorem 3.1, with the use
of the quadratic duality theorem of Cottle [12] in place of the strict
nonlinear duality theorem of [10].

We shall now discuss a method for solving integer or mixed-integer
nonlinear programs, based on the above results. The basic idea of the
method is to solve (D') instead of (P).

We shall consider the mixed-integer nonlinear program (P) introduced
at the beginning of this section, and assume that f(y,x) and each component
of -F(y,x) is concave and differentiable in y and x on $\{(y,x) \in R^n \times R^p \mid y, x \ge 0\}$.
Further, we shall assume that the integer-constrained variables are
bounded, i.e., Y is finite.

Now consider the linearized dual (D') of (P), which is a mixed-integer
nonlinear problem in (y,s,x,u), with an objective function linear
in y, and a constraint set independent of y. For any given $y \in Y$, (D')
becomes a (continuous) nonlinear program in (s,x,u), which we shall denote
by D'(y).

Let g'(y) be the objective function and W'' the constraint set of
D'(y), i.e.,

(5.3)  $W'' = \{(s,x,u) \mid \nabla_x K(s,x,u) \le 0,\ (s,x,u) \ge 0\}$

We assume that $W'' \ne \emptyset$ (this is always the case when (P) has an optimal
solution and meets the regularity condition).

The method we are going to discuss involves, as in the linear case,
the solution of a sequence of problems D'(y) defined by a sequence of vectors
$y \in Y$.

Since each problem $D'(\bar y)$ is the dual of the concave program $P(\bar y)$ obtained
from (P) by setting $y = \bar y$ (see the proof of Theorem 3.1), one can solve
$D'(\bar y)$ by solving $P(\bar y)$ whenever the latter satisfies (or can be perturbed
so as to satisfy) the required constraint qualification. By "solving" a
problem D'(y) we mean finding an optimal solution or an $\varepsilon$-solution (in the
sense defined, for instance, in [?3]), or establishing the fact that D'(y)
has no finite optimum. Further, we shall have to assume that at the end of
the whole procedure, when an optimal solution (or $\varepsilon$-solution) to (D') has
been found, the regularity condition required in Theorem 5.1 holds (or
can be made to hold by some perturbation). However, this assumption is
not needed in the case of a mixed-integer quadratic program, as was
mentioned above.


Now suppose we solve D'(y) for $y = y^1, \ldots, y^q$, $\{1, \ldots, q\} = Q$
($y^k \in Y$, $k \in Q$). For each $k \in Q$, exactly one of the following two situations holds:

a) $D'(y^k)$ has an optimal solution (or an $\varepsilon$-solution) $(s^k, x^k, u^k)$.

b) $g'(y^k)$ is unbounded from below on W''.

For case b) we have

Theorem 5.3. If $g'(y^k)$ is unbounded from below on W'', there exist vectors
$s^k \in R^n$, $x^k \in R^p$, $u^k \in R^m$ and $t^k \in R^m$, such that

(5.4)  $(s^k, x^k, u^k) \in W''$,  $t^k \ge 0$

(5.5)  $\nabla_x t^k F(s^k, x^k) \ge 0$

and

(5.6)  $-t^k F(s^k, x^k) + (s^k, x^k)\nabla_{s,x} t^k F(s^k, x^k) - y^k \nabla_s t^k F(s^k, x^k) < 0$

Proof. Let $e = (1, \ldots, 1) \in R^m$ and let $\xi \ge 0$ be such that
$K(y^k, \xi, e) \equiv f(y^k, \xi) - eF(y^k, \xi)$ is finite. The existence of such a
vector $\xi$ follows from the assumption that f(y,x) and F(y,x) are
differentiable (hence continuous). Then for any $(s,x,u) \in W''$

   $g'(y^k) \ge K(s,x,u) + [(y^k, \xi) - (s,x)]\nabla_{s,x} K(s,x,u)$    [since $\xi \nabla_x K(s,x,u) \le 0$]

           $\ge K(y^k, \xi, u)$    [by the concavity of K(s,x,u)].

Since $K(y^k, \xi, e)$ is finite, it follows that for any finite $u \in R^m$,
$K(y^k, \xi, u)$ is also finite, and $g'(y^k)$ is bounded from below. Hence a
necessary condition for $g'(y^k)$ to have no lower bound on W'' is the existence
of $s^k, x^k, u^k$ and $t^k$ such that, if $\lambda$ is a scalar,

a) $(s^k, x^k, u^k + \lambda t^k) \in W''$ for arbitrary $\lambda \ge 0$, which implies (5.4) and (5.5);

b) for $(s,x,u) = (s^k, x^k, u^k + \lambda t^k)$ and $\lambda \ge 0$, $g'(y^k)$ is a decreasing
function of $\lambda$, which implies (5.6). Q.e.d.


Having solved D'(y) for $y = y^k \in Y$, $k \in Q = \{1, \ldots, q\}$, let $Q = Q_1 \cup Q_2$ with

(5.7)  $Q_1 = \{k \in Q \mid D'(y^k) \text{ has an optimal solution } (s^k, x^k, u^k)\}$
       $Q_2 = \{k \in Q \mid g'(y^k) \text{ is unbounded from below on } W'' \text{ and } s^k, x^k, u^k, t^k \text{ satisfy (5.4), (5.5) and (5.6)}\}$

For each $k \in Q$, let $g^k$ stand for the value of $g'(y^k)$ for $(s,x,u) = (s^k, x^k, u^k)$,
i.e., let

(5.8)  $g^k = K(s^k, x^k, u^k) - (s^k, x^k)\nabla_{s,x} K(s^k, x^k, u^k) + y^k \nabla_s K(s^k, x^k, u^k)$

Further, let

(5.9)  $g^* = \max_{k \in Q_1} g^k$ if $Q_1 \ne \emptyset$,  $g^* = -\infty$ if $Q_1 = \emptyset$.

Theorem 5.4. Any $y \in Y$ (if one exists) such that

(5.10)  $\min_{s,x,u} \{g'(y) \mid (s,x,u) \in W''\} > g^*$

satisfies the constraints

(5.11)  $y \nabla_s K(s^k, x^k, u^k) > g^* - g^k + y^k \nabla_s K(s^k, x^k, u^k)$,  $k \in Q$

(5.12)  $-y \nabla_s t^k F(s^k, x^k) \ge t^k F(s^k, x^k) - (s^k, x^k)\nabla_{s,x} t^k F(s^k, x^k)$,  $k \in Q_2$

where $s^k, x^k, u^k, t^k$ and $g^k$ are defined by (5.7) and (5.8).

Proof. Suppose $y \in Y$ does not satisfy (5.11) for $p \in Q$. Since
$(s^p, x^p, u^p) \in W''$, this implies

   $\inf_{s,x,u} \{g'(y) \mid (s,x,u) \in W''\} \le K(s^p, x^p, u^p) - (s^p, x^p)\nabla_{s,x} K(s^p, x^p, u^p) + y \nabla_s K(s^p, x^p, u^p) \le g^*$

which contradicts (5.10).

Now suppose $y \in Y$ violates (5.12) for $p \in Q_2$. Then, since
$(s^p, x^p, u^p + \lambda t^p) \in W''$ for any $\lambda \ge 0$, we have

   $\inf_{s,x,u} \{g'(y) \mid (s,x,u) \in W''\} \le K(s^p, x^p, u^p + \lambda t^p) - (s^p, x^p)\nabla_{s,x} K(s^p, x^p, u^p + \lambda t^p) + y \nabla_s K(s^p, x^p, u^p + \lambda t^p)$

      $= K(s^p, x^p, u^p) - (s^p, x^p)\nabla_{s,x} K(s^p, x^p, u^p) + y \nabla_s K(s^p, x^p, u^p)$
        $+ \lambda [-t^p F(s^p, x^p) + (s^p, x^p)\nabla_{s,x} t^p F(s^p, x^p) - y \nabla_s t^p F(s^p, x^p)]$

But then in view of (5.7) and Theorem 5.3 the right-hand side,
and hence also the left-hand side of the above expression can be decreased
arbitrarily by increasing $\lambda$, which contradicts (5.10). Q.e.d.

Corollary 5.4. If there is no $y \in Y$ satisfying the system (5.11), (5.12),
then either

a) $Q_1 = \emptyset$ and (P) has no feasible solution, or

b) $Q_1 \ne \emptyset$ and the vector $y^* \in Y$ associated with the last $g^*$ defines
an optimal solution to (P).

Proof. If $Q_1 = \emptyset$, g'(y) has no lower bound on W'' for any $y \in Y$. Hence
(Theorem 1, [13]) the dual of the convex program D'(y) has no feasible
solution for any $y \in Y$, and so (P) itself has no feasible solution.

If $Q_1 \ne \emptyset$, denote by $(s^*, x^*, u^*)$ the optimal solution to $D'(y^*)$. Then,
if (D') meets the regularity condition, $(y^*, x^*)$ is an optimal solution to (P)
(Theorem 5.1). If not, and if the regularity condition is not required
(as in the quadratic case), then the optimal solution to the concave
(quadratic) program $P(y^*)$ obtained from (P) by setting $y = y^*$ is also an
optimal solution to (P) (Theorem 5.2). Q.e.d.
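Both families of constraints are linear in y once the gradients at $(s^k, x^k, u^k)$ are known. The sketch below assembles them from plain lists; the function names and argument layout are assumptions for illustration. The second constraint is written in the direction implied by (5.6), namely that the bracket $-t^k F + (s^k, x^k)\nabla_{s,x} t^k F - y \nabla_s t^k F$ must be nonnegative for y to remain a candidate.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def value_constraint(grad_s_K, y_k, g_k, g_star):
    """Constraint of type (5.11): coef . y > rhs, with coef the gradient of
    K in s at (s^k, x^k, u^k) and rhs = g* - g^k + coef . y^k."""
    return list(grad_s_K), g_star - g_k + dot(grad_s_K, y_k)


def ray_constraint(grad_s_tF, grad_x_tF, tF, s_k, x_k):
    """Constraint of type (5.12): coef . y >= rhs, with coef = -grad_s(tF)
    and rhs = tF - s^k . grad_s(tF) - x^k . grad_x(tF)."""
    rhs = tF - dot(s_k, grad_s_tF) - dot(x_k, grad_x_tF)
    return [-v for v in grad_s_tF], rhs


# Hypothetical gradient data for two integer variables and one continuous one.
coef, rhs = value_constraint([1, -2], y_k=[1, 1], g_k=3, g_star=5)
ray_coef, ray_rhs = ray_constraint([2, 0], [1], tF=4, s_k=[1, 1], x_k=[3])
```

With $g^* \ge g^k$, the generating point $y^k$ itself fails `dot(coef, y_k) > rhs`, which is what prevents it from being selected again.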

Based on the above results, we can now formulate a procedure for
solving integer or mixed-integer nonlinear programs with the required
properties (shown in Theorems 5.1 and 5.2), which generalizes to these
cases the algorithm discussed in section 4.

Phase I. Find $y^s \in Y$ satisfying the linear inequalities (5.11), (5.12).
(At the start this constraint set is vacuous; thus $y^1 \in Y$ is arbitrary.)
Go to Phase II.

Phase II. Solve $D'(y^s)$. If it has an optimal solution ($\varepsilon$-solution),
generate a constraint (5.11) and, if $g^s > g^*$, update $g^*$ (i.e., set $g^* = g^s$).
If $g'(y^s)$ has no lower bound on W'', generate a constraint (5.11) and a
constraint (5.12). Then go to Phase I.
Theorem 5.5. In a finite number of iterations, the algorithm consisting
of phases I and II ends with the set (5.11), (5.12) having no feasible
solution $y \in Y$.

Proof. When a new constraint (5.11) or (5.12) is generated in phase II, it
is violated by the last $y \in Y$ found in phase I. Hence no constraint is
generated twice (a new constraint, violated by y, cannot be identical with any
of the old ones, satisfied by y); and no $y \in Y$ is generated twice (a new $y \in Y$,
satisfying all current constraints, cannot be identical with any of the old
ones, each of which violates at least one of the current constraints). Since
Y is assumed to be finite, the theorem follows.

Remark. This proof is valid as long as all the constraints generated
under the procedure are kept and used in each phase I. If they are not,
convergence will depend on the non-redundancy (convergence) of the
procedure for generating the elements of the finite set Y, as in the case of the
algorithm of section 4. On the other hand, it is easy to see that the above
convergence proof is not affected if in phase II, whenever g'(y) has no
lower bound on W'', we generate only a constraint (5.12), instead of also
generating a constraint (5.11). This may sometimes be preferable [7], as a
direction vector t may be easier to obtain than the associated feasible
solution $(s^s, x^s, u^s)$ to $D'(y^s)$.

The  procedure  outlined  above  can  be  implemented  in  several  ways. 

Phase I is a search for a solution y to the constraints (5.11), (5.12)
over the set Y. As shown in section 4, this search is not to be restarted
from the beginning for each phase I; rather the successive applications of
phase I should constitute successive stages of a single search process over Y.
If $Y = \{0,1\}^n$, the implicit enumeration techniques known for linear programs
in 0-1 variables, with their various exclusion tests, can be used here as
in section 4. If Y is the set of nonnegative integers, then a technique
of the type discussed in [16], p. 942-943, or in [7], can be used to transform
the problem in integer variables into one in 0-1 variables at a relatively
modest price in terms of problem size, and the implicit enumeration
techniques are again applicable.
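As a rough illustration of how an integer variable can be rewritten in 0-1
variables, the sketch below uses the standard binary expansion
$y = \sum_j 2^j z_j$ with $z_j \in \{0,1\}$. This is not necessarily the
technique of [16] or [7]; it also assumes a known upper bound on y, which the
text does not state. The function names are hypothetical.

```python
def binary_expansion_weights(upper_bound):
    """Weights 2^j so that every integer in [0, upper_bound] is a 0-1
    combination of them. Assumes a known finite upper bound."""
    if upper_bound < 0:
        raise ValueError("upper_bound must be nonnegative")
    n_bits = max(1, upper_bound.bit_length())
    return [1 << j for j in range(n_bits)]


def encode(y, weights):
    """0-1 vector z with sum(weights[j] * z[j]) == y."""
    return [(y >> j) & 1 for j in range(len(weights))]


def decode(z, weights):
    """Recover the integer value from its 0-1 representation."""
    return sum(w * b for w, b in zip(weights, z))


weights = binary_expansion_weights(10)   # 0 <= y <= 10 needs four 0-1 variables
z = encode(6, weights)                   # bits of 6, least significant first
```

The number of 0-1 variables grows only logarithmically with the bound, which
is one reading of the "relatively modest price" mentioned above. The bound
itself must still be imposed as the linear constraint
$\sum_j 2^j z_j \le$ upper bound, since the expansion can represent larger values.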

As to phase II, from a computational standpoint it seems preferable,
whenever it is possible (see Theorems 5.1, 5.2), to find an optimal solution
to $D'(y^s)$ by solving the problem $P(y^s)$ obtained from (P) by setting $y = y^s$.
If, for some $s \in Q$, $P(y^s)$ does not satisfy the constraint qualification at
the optimum, the optimal solution of $P(y^s)$ may still yield a solution
to $D'(y^s)$. Should this not be the case, the current $y^s$ can simply be
dropped and another $y \in Y$ generated. This will not affect the convergence
of the procedure, provided one makes sure that $y^s$ is not repeated.

This procedure is perfectly valid (in fact, considerably simplified)
in the special case when all the variables of (P) are integer-constrained.
The inequality set of (D') is then vacuous, and (D') becomes the problem
($D^0$) shown in Remark 2 to Theorem 5.1. Since the concavity of K(s,u) in s
implies the relation

(5.13)  $K(s,u) + (y - s)\nabla_s K(s,u) \ge K(y,u)$

which holds as an equality for s = y, phase II reduces to solving the
problem $D^0(y^k)$ in u:

$D^0(y^k)$:  $\min_{u \ge 0} K(y^k,u) = \min_u \{ f(y^k) - uF(y^k) \mid u \ge 0 \}$

Whenever $F(y^k) \le 0$, $u = 0$ solves $D^0(y^k)$, and a constraint (5.11), which
now becomes

(5.14)  $y \nabla_s f(y^k) > g^* - f(y^k) + y^k \nabla_s f(y^k)$

is generated for phase I. Whenever $F_i(y^k) > 0$ for $i \in M^+$, $K(y^k,u)$ has no
lower bound on $\{u \in R^m \mid u \ge 0\}$. Then the vector $t^k$ such that $t_i^k = 1$
for $i \in M^+$ and $t_i^k = 0$ for $i \in M \setminus M^+$ defines a constraint of type
(5.12) for phase I.

A detailed discussion of the above algorithm as specialized to integer
and mixed-integer quadratic programming, along with numerical examples,
is given in [7].
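The pure-integer specialization can be sketched in Python for
$Y = \{0,1\}^n$, a concave objective f and convex constraint functions F. This
is a minimal sketch under stated assumptions, not the implementation of [7].
Phase I is a plain enumeration rather than implicit enumeration with exclusion
tests. The feasibility cut is assumed to be the linearization of
$\sum_{i \in M^+} F_i$ at $y^k$ required to be nonpositive, which is our guess
at the form of (5.12). The cuts of type (5.14) are compared against the current
best value $g^*$. All function and variable names are hypothetical.

```python
from itertools import product


def linearize(const, grad, y_k, y):
    """First-order expansion const + (y - y_k) . grad, evaluated at y."""
    return const + sum(g * (a - b) for g, a, b in zip(grad, y, y_k))


def solve_pure_integer(n, f, grad_f, F, grad_F, eps=1e-9):
    """Maximize concave f over y in {0,1}^n subject to convex F(y) <= 0.

    F returns a list of constraint values, grad_F a list of gradient rows.
    """
    best_y, g_star = None, float("-inf")
    value_cuts, feas_cuts = [], []            # each entry: (const, grad, y_k)
    for y in product((0, 1), repeat=n):       # phase I: one pass over Y
        # Cuts of type (5.14): the linearized objective must exceed g*.
        if any(linearize(c, g, yk, y) <= g_star + eps for c, g, yk in value_cuts):
            continue
        # Assumed form of (5.12): linearized violated constraints must be <= 0.
        if any(linearize(c, g, yk, y) > eps for c, g, yk in feas_cuts):
            continue
        Fy = F(y)                             # phase II at the current y
        violated = [i for i, v in enumerate(Fy) if v > eps]
        if not violated:                      # u = 0 solves the dual
            fy = f(y)
            if fy > g_star:
                best_y, g_star = y, fy
            value_cuts.append((fy, grad_f(y), y))
        else:                                 # t_i = 1 on the violated rows
            rows = grad_F(y)
            const = sum(Fy[i] for i in violated)
            grad = [sum(rows[i][j] for i in violated) for j in range(n)]
            feas_cuts.append((const, grad, y))
    return best_y, g_star


# Toy instance (not from the paper): at most two of three variables may be 1.
f = lambda y: 4 * y[0] + 3 * y[1] + 2 * y[2] - sum(y) ** 2
grad_f = lambda y: [4 - 2 * sum(y), 3 - 2 * sum(y), 2 - 2 * sum(y)]
F = lambda y: [sum(v * v for v in y) - 2]
grad_F = lambda y: [[2 * v for v in y]]
best_y, best_val = solve_pure_integer(3, f, grad_f, F, grad_F)
```

Because cuts are only ever added, a point rejected once stays rejected, so a
single pass over Y plays the role of the "single search process" described
above. Each accepted point violates the cut it generates, which is the
mechanism behind the finiteness argument.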


We shall now briefly explore the relationship of the procedure described
in this section to some other methods.

As mentioned above, our method can be viewed as a generalization for
the nonlinear case of the ideas underlying the partitioning procedure of
Benders [2] or the closely related technique of Lemke and Spielberg [17].

While Benders' partitioning procedure is generally used for solving
mixed-integer linear programs, it is in fact slightly more general than
that. Benders partitions a mixed-variables program into two subproblems:
a linear program (say, $P^1$) and a more general problem (say, $P^2$), which
may be, for instance, an integer program, whether linear or not; then
he solves the original problem by solving a sequence of subproblems $P^1$,
$P^2$. But this partitioning method is subject to the following limitations
(also valid for the Lemke-Spielberg algorithm):


1.  The  objective  function  and  each  constraint  has  to  be  separable 
with  respect  to  the  continuous  variables,  i.e.,  no  term  containing  both 
integer  and  continuous  variables  is  allowed. 

2.  The  objective  function  and  the  constraints  have  to  be  linear  in 
the  continuous  variables. 

3. If the objective function and/or the constraints are not linear in
the integer variables, then the subproblem $P^2$ will be a pure integer nonlinear
program for which a solution method has yet to be found.

The algorithm described in the present paper does not have any of these
limitations: 1 and 2 are not required, and 3 does not apply: our counterpart
of Benders' subproblem $P^2$ is a pure integer linear program.

Furthermore,  while  Benders'  partitioning  method  becomes  meaningless 
when  applied  to  a  pure  integer  linear  program  (it  replaces  the  integer 
program  with  itself),  the  algorithm  discussed  in  the  previous  section 
replaces  an  integer  nonlinear  program  by  an  integer  linear  program. 

We shall now discuss the relationship between our method and the cutting
plane method of Kelley [24] for nonlinear programming, which, as Kelley
has shown, can be combined with Gomory's [25] cutting plane method for integer
programming. The constraints (5.11), (5.12) generated in our procedure
are hyperplanes that cut off portions of the set Y containing the current
$y \in Y$, hence they can also be regarded as "cutting planes". But there are
some basic differences:

1. Kelley's method generates a sequence of points outside the
feasible set, which converges to a feasible point. The first point that
is feasible is also optimal, but no feasible point is available before
the end of the procedure. In this sense it is a "dual" method. The same
is true when Kelley's method is combined with Gomory's to solve
an integer nonlinear program (in this case, of course, "feasible" means
a solution which is also integer in the required components).

On the other hand, the method described in this paper generates
a finite sequence of feasible and (occasionally) infeasible (but integer
in the required components) points, with a subsequence of feasible points such
that each point in the subsequence is strictly "better" than the previous
one. At each stage, a currently "best" feasible solution is available.
In this sense this is a "primal" method.

2. Kelley's cutting hyperplanes define a convex set S' containing
the original constraint set S. The role of each newly generated hyperplane
is to cut off a portion of the set S'-S containing the current (infeasible)
solution. Similarly, Gomory's hyperplanes are meant to cut off a portion
of the set S'-S", where S" is the convex hull of the feasible integer points.
Thus, both types of hyperplanes cut off sets of points lying outside the
feasible (integer-feasible) set.

In our procedure, two types of hyperplanes are generated. Both of
them are hyperplanes in n-space, rather than (n+p)-space, i.e., in the
space of the integer-constrained variables rather than the space of all
variables, and they are used as constraints on the (and only on the)
integer-constrained variables $y \in Y$. The main role belongs to the
hyperplanes of type (5.11), which are meant to cut off as large a portion of Y
(whether feasible or not) as a hyperplane containing the current point y can
possibly cut off without cutting off any $y \in Y$ which could yield, in
conjunction with an appropriate x, a "better" integer-feasible solution than
the current "best" one. When hyperplanes of the type (5.12) are generated,
they are meant to cut off portions of Y containing points which cannot yield,
in conjunction with any x, a feasible solution.

3. In Kelley's procedure, a cutting plane is generated by replacing
a constraint function by its first order Taylor series approximation in
the neighborhood of the current solution. In the notation of this section,
this would be

$F_i(\bar{y},\bar{x}) + [(y,x) - (\bar{y},\bar{x})] \nabla F_i(\bar{y},\bar{x}) \le 0$

The dual problem does not play any role in the derivation of this
constraint.
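A small numerical sketch of such a first-order cut may help. It uses a
made-up convex constraint $F(z) = z_1^2 + z_2^2 - 1 \le 0$ and a single vector
z standing for all variables; the function `kelley_cut` and the example data
are illustrative assumptions, not taken from [24].

```python
def kelley_cut(F, grad_F, z_bar):
    """Return (a, b) such that the cut reads a . z <= b.

    It is obtained from F(z_bar) + (z - z_bar) . grad F(z_bar) <= 0.
    """
    a = grad_F(z_bar)
    b = sum(ai * zi for ai, zi in zip(a, z_bar)) - F(z_bar)
    return a, b


F = lambda z: z[0] ** 2 + z[1] ** 2 - 1.0      # unit disc, F(z) <= 0
grad_F = lambda z: [2.0 * z[0], 2.0 * z[1]]

z_bar = (1.0, 1.0)                             # current infeasible point
a, b = kelley_cut(F, grad_F, z_bar)            # cut: 2*z1 + 2*z2 <= 3


def satisfies(z):
    """True when z lies on the feasible side of the cut."""
    return sum(ai * zi for ai, zi in zip(a, z)) <= b
```

The current point (1, 1) violates the cut, while a feasible point such as
(0.6, 0.8) satisfies it; convexity of F guarantees that no feasible point is
removed.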

To give a comparable interpretation to the cutting planes generated
in our procedure, consider the Lagrangian expression associated with the
primal problem

$K(y,x,u) = f(y,x) - uF(y,x)$.

If the current integer point $\bar{y}$ (in n-space) is such that the function
$K(\bar{y},x,u)$ in (x,u) has a saddle-point at $(\bar{x},\bar{u})$, we generate a cutting
plane by requiring the first order Taylor series approximation of $K(y,\bar{x},\bar{u})$
(considered as a function in y defined on $\{y \mid y \ge 0\}$) in the neighborhood
of $y = \bar{y} = \bar{s}$ to satisfy

(5.15)  $K(\bar{s},\bar{x},\bar{u}) + (y - \bar{y}) \nabla_s K(\bar{s},\bar{x},\bar{u}) > g^*$

where  g*  is  defined  by  (5.9).  It  is  easy  to  see  that  (5.15)  is  the  same  as  (5 
If $K(\bar{y},x,u)$ has no saddle-point and $\bar{x}$, $\bar{u}$ and t are such that
$K(\bar{y},\bar{x},\bar{u} + \lambda t) \to -\infty$ when $\lambda \to +\infty$, then two cutting planes are
generated, one of the type (5.11) and a second one of the type (5.12). In each
case the dual vector u (or t) plays a key role in generating the constraints.
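To make the role of the dual vector concrete, here is a toy instance that is
not from the paper: one integer variable y, one continuous variable x,
objective $f(y,x) = 4x - x^2 + y$ to be maximized and constraint
$F(y,x) = x + y - 2 \le 0$. For fixed y the subproblem in x is solved in closed
form, and the cut is the linearization in y of $K(y,\bar{x},\bar{u})$. The helper names
and the closed-form solution are assumptions made for this sketch only.

```python
def subproblem(y):
    """Maximize 4x - x^2 + y over x subject to x + y - 2 <= 0.

    Returns the maximizer x and the multiplier u >= 0 of the constraint.
    """
    x = min(2.0, 2.0 - y)          # unconstrained maximum is at x = 2
    u = 4.0 - 2.0 * x              # stationarity of K in x: 4 - 2x - u = 0
    return x, u


def value(y):
    """Optimal value of the subproblem for fixed y."""
    x, _ = subproblem(y)
    return 4.0 * x - x * x + y


def lagrangian_cut(y_bar):
    """Value and slope in y of K(y, x, u) = f(y, x) - u F(y, x) at y_bar."""
    x, u = subproblem(y_bar)
    K = 4.0 * x - x * x + y_bar - u * (x + y_bar - 2.0)
    slope = 1.0 - u                # dK/dy = df/dy - u * dF/dy
    return K, slope


K_bar, slope = lagrangian_cut(1)   # current integer point y = 1


def bound(y):
    """Left-hand side of the cut: linearized Lagrangian at y."""
    return K_bar + (y - 1) * slope
```

Here the linearization is 5 - y, which overestimates the subproblem value for
every y. With a current best value of 4, requiring 5 - y > 4 leaves only
integer points y < 1, so the multiplier u = 2 alone determines how much of Y is
cut off.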

Hence,  while  our  method  also  generates  a  certain  type  of  cutting 
planes,  it  differs  substantially  from  Kelley's. 